(57) Abstract

The present invention relates to four novel human genes amplified and overexpressed in breast carcinoma and located on the q11-q21.3 region of chromosome 17. The four novel genes are useful in breast cancer prognosis. The present invention also relates to a fifth novel human gene expressed in breast carcinoma and located on chromosome 6q22-q23. A sixth novel gene is also described that is the murine homolog of the human D52 gene. The genes and gene fragments of the present invention are themselves useful as DNA and RNA probes for gene mapping by in situ hybridization with chromosomes and for detecting gene expression in human tissues (including breast and lymph node tissues) by Northern blot analysis.

Isolated Nucleic Acid Molecules Useful as Leukemia Markers
and in Breast Cancer Prognosis



Field of the Invention 

The invention relates to four novel human genes amplified and overexpressed in breast carcinoma. The four genes are located at chromosome 17q11-q21.3. The invention also relates to a fifth novel human gene expressed in breast carcinoma and located at chromosome 6q22-q23. A sixth novel gene is also described that is the murine homolog of the human D52 gene.

Background of the Invention 

Despite earlier detection and a smaller size of the primary tumors at the time of diagnosis (Nyström, L. et al., Lancet 341:973-978 (1993); Fletcher, S.W. et al., J. Natl. Cancer Inst. 85:1644-1656 (1993)), associated metastases remain the major cause of breast cancer mortality (Frost, P. & Levin, R., Lancet 339:1458-1461 (1992)). The initial steps of transformation, characterized by the malignant cell escape from normal cell cycle controls, are driven by the expression of dominant oncogenes and/or the loss of tumor suppressor genes (Hunter, T. & Pines, J., Cell 79:573-582 (1994)).

Tumor progression can be considered as the ability of the malignant cells 
to leave the primary tumoral site and, after migration through lymphatic or blood 

vessels, to grow at a distance in host tissue and form a secondary tumor (Fidler, I.J., Cancer Res. 50:6130-6138 (1990); Liotta, L. et al., Cell 64:327-336 (1991)). Progression to metastasis is dependent not only upon transformation but also upon the outcome of a cascade of interactions between the malignant cells and the host cells/tissues. These interactions may reflect molecular modification of synthesis and/or of activity of different gene products both in malignant and host cells. Several genes involved in the control of tumoral progression have been identified and shown to be implicated in cell adhesion, extracellular matrix degradation, immune surveillance, growth factor synthesis and/or angiogenesis (reviewed in Hart, I.R. & Saini, A., Lancet 339:1453-1461 (1992); Ponta, H. et al., B.B.A. 1198:1-10 (1994); Bernstein, L.R. & Liotta, L.A., Curr. Opin. Oncol. 6:106-113 (1994); Brattain, M.G. et al., Curr. Opin. Oncol. 6:77-81 (1994); and Fidler, I.J. & Ellis, L.M., Cell 79:185-188 (1994)).

However, defining the mechanisms involved in the formation and growth of metastases is still a major challenge in breast cancer research (Rusciano, D. & Burger, M.M., BioEssays 14:185-194 (1992); Hoskins, K. & Weber, B.L., Current Opinion in Oncology 6:554-559 (1994)). The processes leading to the formation of metastases are complex (Fidler, I.J., Cancer Res. 50:6130-6138 (1990); Liotta, L. et al., Cell 64:327-336 (1991)), and identifying the related molecular events is thus critical for the selection of optimal treatments.

Summary of the Invention 

By differential screening of a cDNA library from breast cancer derived metastatic axillary lymph nodes, four clones (MLN 50, 51, 62 and 64) were isolated by the present inventors and determined to be co-localized at the q11-q21.3 region of the chromosome 17 long arm. Several genes implicated in breast cancer progression have been assigned to the same portion of chromosome 17, most notably the oncogene c-erbB-2 in q12 and the recently cloned tumor suppressor gene BRCA1 in q21. Additionally, the D53 gene was cloned by the present inventors from a cDNA library of primary infiltrating ductal breast carcinoma using an expressed sequence tag that was identified to be homologous to the previously identified D52 gene, and the D53 gene was localized to chromosome 6q22-q23.

The four MLN genes of the present invention are useful as prognostic markers for breast cancer. Although no group of the art-known prognosticators completely fulfills the objective to fully distinguish high- and low-risk patients, combinations of the prognostic factors can improve the prediction of a patient's prognosis. Thus, by the invention, further prognostic markers are provided which can be added to the population of art-known prognosticators to more particularly distinguish between high- and low-risk breast cancer patients. By the invention, when compared to MLN 50, 51, 62, or 64 gene expression level or gene copy number in non-tumorigenic breast tissue, enhanced MLN 50, 51, 62, or 64 gene expression level or gene copy number in breast cancer tissue is indicative of a high-risk breast cancer patient.

The invention further provides a method for distinguishing between different types of acute myeloid leukemia, which involves assaying leukemia cells for D52 or D53 gene expression; whereby the presence of D52 transcripts (mRNA) or protein or the lack of D53 mRNA or protein indicates that the leukemia cells have myelocytic characteristics (such as HL-60 cells) and the presence of D53 mRNA or protein or the lack of D52 mRNA or protein indicates that the leukemia cells have erythroid characteristics (such as K-562 cells).
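
The decision rule stated above can be sketched in code. The following Python fragment is only an illustration: the function name, the boolean inputs, and the "indeterminate" outcome for samples in which both or neither marker is detected are assumptions introduced here and are not part of the described method.

```python
def classify_leukemia(d52_detected: bool, d53_detected: bool) -> str:
    """Illustrative rule: D52 present and D53 absent suggests myelocytic
    characteristics (HL-60-like); D53 present and D52 absent suggests
    erythroid characteristics (K-562-like).

    The 'indeterminate' branch is an assumption for inputs the text
    does not address (both markers detected, or neither).
    """
    if d52_detected and not d53_detected:
        return "myelocytic"
    if d53_detected and not d52_detected:
        return "erythroid"
    return "indeterminate"


if __name__ == "__main__":
    # HL-60-like profile: D52 transcripts detected, D53 not detected.
    print(classify_leukemia(d52_detected=True, d53_detected=False))
    # K-562-like profile: D53 transcripts detected, D52 not detected.
    print(classify_leukemia(d52_detected=False, d53_detected=True))
```

How "detected" is decided (for example, a band on a Northern blot or a protein signal) is left to the assay and is outside this sketch.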

Also provided are isolated nucleic acid molecules encoding MLN 50, 51, 62, 64, D53, or murine (m) D52 polypeptides whose amino acid sequences are shown in Figures 14, 21(A-D), 6, 16, 24(B) and 25(B), respectively. In another aspect, the invention provides isolated nucleic acid molecules encoding MLN 50, 51, 62, 64, or D53 polypeptides having an amino acid sequence as encoded by the cDNAs deposited as ATCC Deposit Nos. 97608, 97611, 97610, 97609 and 97607, respectively. Further embodiments of the invention include isolated nucleic acid molecules that are at least 90% and preferably at least 95%, 97%, 98% or 99% identical to the above-described isolated nucleic acid molecules of the present invention.
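
As a rough illustration of what a percent-identity threshold means, the sketch below computes identity between two sequences that are assumed to be already aligned to equal length, with '-' marking gaps. The function names and the counting convention (columns where both sequences carry a gap are skipped; a gap opposite a base counts as a mismatch) are assumptions made for this example. Any definition of identity given elsewhere in the specification governs over this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumptions (not taken from the specification): both inputs have equal
    length, '-' marks a gap, and columns with a gap in both sequences are
    ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Nine of ten aligned positions match.
    print(percent_identity("ATGCATGCAT", "ATGCATGCAA"))
    print(meets_identity("ATGCATGCAT", "ATGCATGCAA", threshold=95.0))
```

The thresholds named in the text (90%, 95%, 97%, 98%, 99%) would simply be passed as the `threshold` argument.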

The present invention also relates to vectors which contain the above-described isolated nucleic acid molecules, host cells transformed with the vectors and the production of MLN 50, 51, 62, 64, mD52 or D53 polypeptides or fragments thereof by recombinant techniques.

The present invention further provides an isolated MLN 50, 51, 62, 64, D53 or mD52 polypeptide having an amino acid sequence as shown in Figures 14, 21(A-D), 6, 16, 24(B) or 25(B), respectively. In a further aspect, an isolated MLN 50, 51, 62, 64 or D53 polypeptide is provided having an amino acid sequence as encoded by the cDNAs deposited as ATCC Deposit Nos. 97608, 97611, 97610, 97609 and 97607, respectively.

Brief Description of the Figures

Figure 1. Expression Analysis of the 10 MLN Genes. Northern blots contained 10 μg of total RNA isolated from MLN (lanes 1), NLN (lanes 2) and FA (lanes 3). Five filters were prepared and each of them was successively hybridized using two MLN cDNA probes (MLN 62 and 50; MLN 74 and 51; MLN 19 and 64; MLN 10 and 137; MLN 4 and 70) and the internal loading control 36B4. rRNA size markers (S values) are indicated (left).

Figure 2. Chromosomal Assignment of MLN 50, 51, 62 and 64 Genes by In Situ Hybridization. (A) Idiogram of the human G-banded chromosome 17 illustrating the distribution of labeled sites for MLN 50, 51, 62 and 64 cDNA probes. (B) Tentative relative assignment of the MLN genes within the q11-q21.3 region of the long arm of chromosome 17.

Figure 3. Expression Analysis of MLN 50, 51, 62 and 64 Genes in Breast Cancer Cell Lines. Ten μg of total RNA from breast cancer cell lines were loaded in each lane. Hybridizations were carried out successively with probes corresponding to MLN 50, 51, 62 and 64. Control hybridizations were performed with MLN 19 (c-erbB-2), p53 and 36B4. Approximate sizes of the mRNAs are indicated in kb (right).



WO 97/06256



PCT/US96/12500 






Figure 4. Northern Blot Analysis of CART1 mRNA in Human Breast Fibroadenomas, Carcinomas and Lymph Node Metastases. Each lane contained 10 μg of total RNA. From left to right, RNA samples from breast fibroadenomas (FA, lanes 1-6), carcinomas (BC, lanes 7-16) and metastatic lymph nodes (MLN, lanes 17 and 18) were loaded. Hybridization was carried out using a cDNA probe for CART1. A 2000-base long CART1 transcript was expressed, at various levels, in some carcinomas (lanes 7, 11 and 13), and in one metastatic sample (lane 17). The 36B4 probe (Masiakowski, P. et al., Nucl. Acids Res. 10:7895-7903 (1982)) was used as positive internal control. Autoradiography was for 2 days for hybridization of CART1, whereas 36B4 hybridization was exposed for 16 hrs.



Figure 5. In Situ Hybridization of CART1 mRNA in Human Breast Carcinoma and Axillary Lymph Node Metastasis. Sections of normal breast (A), in situ carcinoma (C), invasive carcinoma (B) and metastatic lymph node (D) were hybridized with antisense 35S-labeled RNA probe specific for CART1. CART1 was strongly expressed in the tumoral epithelial cells, whereas the stromal part of the tumor was totally negative (B). CART1 transcripts were homogeneously distributed throughout the positive areas (B-D). Normal ducts were devoid of CART1 signal (A). No significant labeling above background was found when using sense human CART1 RNA probe (data not shown). Bright field (A-D).

Figure 6. Nucleotide and Amino Acid Sequences of Human CART1. Nucleotide sequence (SEQ ID NO:1) is numbered in the 5' to 3' direction and amino acid sequence (SEQ ID NO:2) in the open reading frame is designated by the one letter code. The underlined nucleotide sequences correspond to the Kozak and poly(A) addition signal sequences. Putative NLS sequences are bold-typed and broken underlined. The two C-rich regions are boxed and H and C residues are bold-typed. Restricted TRAF domain is grey-boxed. Arrowheads indicate the splicing sites and asterisk the stop codon.




Figure 7. Primary Structure of the CART1 C3HC3D Motif and Comparison with RING Finger Proteins from Various Species. These sequences are aligned to each other using the PileUp program (Feng, D.F. & Doolittle, R.F., J. Mol. Evol. 25:351-360 (1987)). Bracket numbers indicate the respective position of the motif in each protein. Residues identical in all sequences are bold-typed, and the conservative residues (R/K; W/L; Y/F; D/E; N/Q; S/T) are grey-boxed. Gaps are used to optimize alignment. H, Homo (CART1 (SEQ ID NO:2), RING1 (SEQ ID NO:13), BRCA1 (SEQ ID NO:14), CD40bp (SEQ ID NO:15), SS-A/Ro (SEQ ID NO:16), MEL18 (SEQ ID NO:17)); M, Mus (TRAF2 (SEQ ID NO:18), RPT-1 (SEQ ID NO:19)); X, Xenopus (XNF7 (SEQ ID NO:20)); D, Drosophila (SU(z)2 (SEQ ID NO:21)); S, Saccharomyces (RAD18 (SEQ ID NO:22)); D, Dictyostelium (DG17 (SEQ ID NO:23)).

Figure 8. Pattern of AvaII Digestion of the Full-Length CART1 cDNA. (A) Positions and sequence of AvaII sites (bold-typed) in the full-length CART1 cDNA (SEQ ID NO:1). Corresponding protein sequence from residues 54 to 60 of SEQ ID NO:2 is indicated using one letter code. D is bold-typed. (B) Ethidium bromide staining of gel electrophoresis of the CART1 AvaII digest. Molecular weight (m.w.) and CART1 fragment sizes are given on the left and right sides, respectively.

Figure 9. Primary Structure of the Three Original HC3HC3 C-rich Motifs Present in CART1 and Comparison with Those of CD40-bp, TRAF2 and DG17. Alignment and conventional symbols are as described in the Figure 7 legend above: CART1 (101-154) (SEQ ID NO:2); CART1 (155-208) (SEQ ID NO:2); CART1 (209-267) (SEQ ID NO:2); CD40bp (134-189) (SEQ ID NO:24); CD40bp (190-248) (SEQ ID NO:25); TRAF2 (124-176) (SEQ ID NO:26); TRAF2 (177-238) (SEQ ID NO:27); DG17 (193-250) (SEQ ID NO:28).

Figure 10. Primary Structure of the Restricted TRAF Motif and Comparison with Those of CD40-bp, TRAF1 and TRAF2. Alignment and conventional symbols are as described in the Figure 7 legend above. Consensus sequence (SEQ ID NO:32) is indicated for CART1 (308-387) (SEQ ID NO:2), CD40bp (415-494) (SEQ ID NO:29), TRAF1 (260-339) (SEQ ID NO:30), and TRAF2 (352-431) (SEQ ID NO:31). Consensus sequence (SEQ ID NO:36) is indicated for CART1 (388-470) (SEQ ID NO:2), CD40bp (495-567) (SEQ ID NO:33), TRAF1 (340-409) (SEQ ID NO:34), and TRAF2 (432-501) (SEQ ID NO:35).



Figure 11. Organization of the Human CART1 Gene and Protein. Schematic representation of the CART1 gene exon/intron organization. Exons are numbered from 1 to 7. The correspondence between DNA coding sequences and protein domains is indicated (B, BamHI; ORF, open reading frame; UTR, untranslated region).

Figure 12. Comparison of CART1, CD40-bp, TRAF2 and DG17 Protein Structural Organization. The size and position of RING finger, CART motif, α helix and restricted TRAF domain are represented for each of these proteins, highlighting the similarity of their protein organization.

Figure 13. Northern Blot Analysis of Lasp-1 mRNA Expression in Human Tissues. (A) Total RNA (10 μg) extracted from breast-derived metastatic lymph node (lanes 1 and 2), breast carcinomas (lanes 3-12), fibroadenomas (lanes 13-17) and breast hyperplasia (lane 18) were loaded, transferred, and hybridized with 32P-labeled probes specific for c-erbB-2, Lasp-1 and to the RNA loading control 36B4. Approximate transcript sizes are indicated (right). (B) Total RNA extracted from normal lymph node (lane 1), normal skin (lane 2), normal lung (lane 3), normal stomach (lane 4), normal colon (lane 5), normal liver (lane 6), SK-BR-3 (lane 7), BT-474 (lane 8) and MCF-7 (lane 9) were loaded, transferred, and hybridized with 32P-labeled probes specific for c-erbB-2, Lasp-1 and to the RNA loading control 36B4. Approximate transcript sizes are indicated.

Figure 14. Nucleotide and Amino Acid Sequences of Human Lasp-1. (A) Nucleotide sequence (SEQ ID NO:3) and amino acid sequence (SEQ ID NO:4) of human Lasp-1. Nucleotides and amino acid residues are numbered on the left and right, respectively. The consensus residues involved in the LIM domain are underlined and bolded and in the SH3 domain are bolded. Putative tyrosine residues in tyrosine kinase phosphorylation are underlined. An asterisk denotes the termination codon. The signal for polyadenylation is underlined. (B) Structure of Lasp-1 cDNA. The shaded box indicates the protein-coding region. The positions of the different expressed sequence tags with homology to Lasp-1 are indicated with their corresponding length and accession numbers.

Figure 15. Comparison of the Lasp-1 LIM and SH3 Domains with Other Proteins. (A) Comparison of Lasp-1 LIM domain (residues 1-51 of SEQ ID NO:4) with other LIM proteins: YLZ4 (1-51) (SEQ ID NO:37); hCRIP (1-55) (SEQ ID NO:38); rCRP2 (1-56) (SEQ ID NO:39); rCRP2 (119-180) (SEQ ID NO:40); TSF3 (5-64) (SEQ ID NO:41); TSF3 (104-162) (SEQ ID NO:42). The consensus LIM domain residues are bolded, identical residues are dashed, (.) indicates gaps in the alignment. (B) Comparison of Lasp-1 SH3 domain (residues 196-261 of SEQ ID NO:4) with other proteins: YLZ3 (134-200) (SEQ ID NO:43); EMS1 (486-550) (SEQ ID NO:44); ABP1 (526-592) (SEQ ID NO:45); Mfyn (76-141) (SEQ ID NO:46); h/ac (78-144) (SEQ ID NO:47); h/frg (71-135) (SEQ ID NO:48); h/yes (85-152) (SEQ ID NO:49). The identical residues are dashed, conserved or semiconserved residues in more than half of the aligned sequences are bolded, (.) indicates gaps in the alignment.

Figure 16. Nucleotide and Amino Acid Sequences of Human MLN 64. Nucleotide sequence (SEQ ID NO:5) is numbered in the 5' to 3' direction and amino acid sequence (SEQ ID NO:6) in the open reading frame is designated by the one letter code. The underlined nucleotide sequences correspond to the Kozak and poly(A) addition signal sequences. The dashed underlined nucleotide sequences correspond to the sequences which could be deleted; 0 new splicing site after deletion; i sites of insertions. Synthetic peptide sequence is bold-typed. Arrowheads indicate the splicing sites and asterisk the stop codon.

Figure 17. Organization of the Human MLN 64 Gene and Protein. Schematic representation of the MLN 64 gene exon/intron organization. Exons are numbered from 1 to 15 (hatched and open boxes for coding and noncoding exons, respectively). Arrows indicate the nucleotide substitution, exon deletion and intron insertion sites (a: exon 2, C/T substitution, b: exon 2, 137 bp 5' end deletion, c: exon 4, A/G substitution, d: exon 4, 13 bp 3' end deletion, e: intron 6, 199 bp 5' end insertion, f: complete exon 7 deletion, g and h: intron 9, 51 bp and 657 bp 5' end insertion).

Figure 18. Northern Blot Analysis of MLN 64 mRNA in Human Breast Fibroadenomas, Carcinomas and Lymph Node Metastases. Each lane contained 10 μg of total RNA. From left to right, RNA samples from breast fibroadenomas (lanes 1-6), carcinomas (lanes 7-14), normal lymph nodes (lanes 15 and 16) and metastatic lymph nodes (lanes 17 and 18) are loaded. Hybridization was carried out using a cDNA probe for MLN 64. A 2000-base long MLN 64 transcript is expressed, at various levels, in some carcinomas (lanes 6, 10 and 11), and in the metastatic samples (lanes 16 and 17). The same pattern of expression was observed using an erbB-2 probe. The 36B4 probe (Masiakowski, P. et al., Nucl. Acids Res. 10:7895-7903 (1982)) was used as positive internal control. Autoradiography was for 2 days for hybridization of MLN 64 and erbB-2, whereas 36B4 hybridization was exposed for 16 hrs.

Figure 19. In Situ Hybridization of MLN 64 mRNA in Human Breast Carcinoma and Axillary Lymph Node Metastasis. Sections of normal breast (A), in situ carcinoma (C), invasive carcinoma (B) and metastatic lymph node (D) were hybridized with antisense 35S-labeled RNA probe specific for MLN 64. MLN 64 is strongly expressed in the tumoral epithelial cells, whereas the stromal part of the tumor is totally negative (B). MLN 64 transcripts are homogeneously distributed throughout the positive areas (B-D). Normal ducts are devoid of MLN 64 signal (A). No significant labeling above background was found when using sense human MLN 64 RNA probe (data not shown). Bright field (A-D).

Figure 20. Immunohistochemistry of Human Breast Carcinoma and Axillary Lymph Node Metastasis. Sections of normal breast (A), in situ carcinoma (C), invasive carcinoma (B) and metastatic lymph node (D) were studied for the presence of MLN 64 protein, using a monoclonal antibody (see Materials and Methods). MLN 64 is strongly expressed in the tumoral epithelial cells, whereas the stromal part of the tumor is totally negative (B). MLN 64 protein was located in cytoplasmic bundle-like structures (D). Normal ducts are devoid of MLN 64 staining (A).

Figure 21 (A-D). Nucleotide and Amino Acid Sequences of Human MLN 51. Nucleotide sequence (SEQ ID NO:7) is numbered in the 5' to 3' direction. The length of the sequence is 4253 bases and includes an additional untranslated 233 nucleotides on the 5' end. Amino acid sequence (SEQ ID NO:8) is numbered in the 5' to 3' direction (underneath). The length of the sequence is 534 amino acids.

Figure 22. Alignment of Expressed Sequence Tags (ESTs) with Homology to the CART1 cDNA Sequence. Nine ESTs with homology to part of the CART1 nucleotide sequence were identified in GenBank. The accession number and alignment relative to the CART1 gene are indicated. The CART1 ORF is boxed.

Figure 23. Alignment of Expressed Sequence Tags (ESTs) with Homology to the MLN 51 cDNA Sequence. Three ESTs with homology to part of the MLN 51 nucleotide sequence were identified in GenBank. The accession number and alignment relative to the MLN 51 gene are indicated.

Figure 24 (A)-(B). Diagrammatic Representation of 3 hD53 cDNAs. (A) Diagrammatic representation of 3 hD53 cDNAs, with clones 83289 and 116783 representing cDNAs isolated by the Washington University-Merck EST project, and clone U1 representing a cDNA isolated from the human breast carcinoma cDNA library during this study. Shaded regions indicate 5'-UTR sequence, solid regions indicate coding sequence and open regions indicate 3'-UTR sequence. The polyadenylation signals associated with polyA sequences are indicated, as is a clone 83289 deletion, and an Alu sequence in the 3'-UTR of clone 83289. (B) Nucleotide sequence (SEQ ID NO:9) and amino acid sequence (SEQ ID NO:10) determined for the hD53 U1 cDNA. The predicted coding sequence is translated using the one letter code (in bold), with numbering in italics referring to the translated product, and all other numbering referring to the nucleotide sequence. Within the 3'-UTR, the polyadenylation signal (ATTAAA, nucleotides 1308-1313 of SEQ ID NO:9) is shown underlined and in bold, as is the corresponding site of polyA addition (nucleotide 1325).

Figure 25 (A)-(B). Diagrammatic Representation of Two mD52 cDNAs. (A) Diagrammatic representation of two mD52 cDNAs isolated from the apoptotic mouse mammary gland cDNA library. Shaded regions indicate 5'-UTR sequence, solid regions indicate coding sequence and open regions indicate 3'-UTR sequence. The polyadenylation signals associated with polyA sequences are indicated. (B) Nucleotide sequence (SEQ ID NO:11) and amino acid sequence (SEQ ID NO:12) determined for the mD52 C1 cDNA. The predicted coding sequence is translated using the one letter code (in bold), with numbering in italics referring to the translated product, and all other numbering referring to the nucleotide sequence. Within the 3'-UTR, two polyadenylation signals (ATTAAA, nucleotides 976-981, and AATAAA, nucleotides 2014-2019, both of SEQ ID NO:11) are shown underlined and in bold, as are the corresponding sites of polyA addition (nucleotides 1012 and 2033 of SEQ ID NO:11).

Figure 26 (A)-(B). Alignment of mD52, hD52 and hD53. (A) Alignment of mD52 (SEQ ID NO:12), hD52 (SEQ ID NO:50) and hD53 (SEQ ID NO:10) amino acid sequences, shown using the one-letter code, as produced by the program PileUp. Numbers above and below the sequences refer to amino acid positions in mD52 and hD53, respectively, with numbering being identical for the 3 sequences up to residue 127, and for hD52 and mD52 up to residue 171. Vertical lines and colons indicate residues identical or conserved, respectively, in mD52 and hD52, and/or in hD52 and hD53 sequences. The following substitutions were allowed: MILVA, GA, DE, TS, QN, YFW, RKH. The
combined limits of the N-terminal PEST domains (Lys^-Arg*" in mD52, Arg^-Arg* in hD52, and Met'-Lys" in hD53), coiled-coil domains (Glu^-Leu" in mD52, Ala^-Leu" in hD52 and Val^-Leu" in hD53), and C-terminal PEST domains (Lys^-Pro»" in mD52, Lys»«-Lys in hD52 and Lys "In hD53) are indicated above the sequences. In addition, potential sites of N-glycosylation (Asn"' and Asn" in mD52, Asn^ in hD52, and Asn« in hD53) are shown underlined and in bold. Potential sites of phosphorylation by casein kinase II (Ser*, Thr°, Thr^, Ser«, Ser^ in mD52; Ser*, Thr", Ser«, Ser**, Ser~, Thr"» in hD52; Thr", Ser«, Ser«, Ser«, Ser«», Ser"t, Thr'^ in hD53), protein kinase C (Thr^, Thr» in mD52 and hD52; Thr«, Ser«, Ser*", Ser«, Thr»«, Ser"; Ser« in hD53), cAMP- and cGMP-dependent kinase (Ser** in mD52 and hD52), and tyrosine kinase (Tyr" in hD53) are all shown in bold. (B) The aligned coiled-coil domains identified in mD52 (SEQ ID NO:12), hD52 (SEQ ID NO:51) and hD53 (SEQ ID NO:10) sequences, shown using the one-letter code. Numbers below the sequences refer to amino acid positions in the 3 sequences. The abcdefg heptad repeat pattern is indicated above the sequences, with positions a and d (frequently occupied by hydrophobic amino acids in coiled-coil domains) shown in bold, and positions e and g (frequently occupied by negatively and positively charged amino acids, respectively) underlined. Where mD52, hD52 and hD53 sequences are in accordance with this consensus, the relevant residues are correspondingly shown in bold or underlined.

Figure 27 (A)-(B). (A) Ideogram of the human G-banded chromosome 6 illustrating the distribution of labeled sites with the 116783 hD53 probe. (B) Localization of the mD52 gene to mouse chromosomes 3 and 8 by in situ hybridization. Idiograms of WMP mouse Rb (3;12) and Rb (8;9) chromosomes, indicating the distributions of labeled sites on chromosomes 3 and 8.

Figure 28. The Effects of Estradiol Treatment on hD52 and hD53 Transcript Levels in Human Breast Carcinoma Cell Lines. Northern blot analyses were performed using 10 μg total RNA for each sample. The identity and size (in parenthesis) of each transcript is indicated to the right of each panel, whereas the corresponding duration of autoradiographic exposure is shown on the left. For each cell line, lane 1 indicates total RNA from cells grown for 6 days in normal media (see Materials and Methods), lane 2 indicates total RNA from cells grown for 1 day in normal media and for 5 days in phenol red-free DMEM with 10% steroid-depleted FCS and 0.6 μg/ml insulin, lane 3 is as for lane 2 except that for the last 3 days of culture, media were supplemented with 10"^ M estradiol, and lane 4 is as for lane 2 except that for the last 3 days of culture, media were supplemented with 10"* M estradiol. ER+/ER- indicates the presence/absence of
the estrogen receptor in the cell line(s) shown below. The hD52 and hD53 transcripts were co-expressed in the 3 cell lines, and transcript levels for both genes were similarly affected by estradiol stimulation/deprivation in MCF7 cells, and were not affected by the same treatments in BT-^^ Differing effects on hD52 and hD53 transcript levels were noted in the e9q>er^ The estrogen-inducible pS2 gene was used as a control for the effectiveness of estradiol supplementations/deprivations. As expected, the presence of estradiol induced pS2 expression in ER+ cell lines, but not in the ER- cell line. For all 3 cell lines, similar results were obtained in at least one other experiment performed on a separate occasion.

Figure 29. The Effects of TPA Treatment on hD52 and hD53 Transcript Levels in Human Leukemia Cell Lines. Northern blot analyses were performed using 10 μg total RNA for each sample. The identity and size (in parenthesis) of each transcript is indicated to the right of each panel, whereas the corresponding duration of autoradiographic exposure is shown on the left. Lanes marked (C) indicate total RNA from cells grown in normal media (see Materials and Methods), lanes marked (16) indicate total RNA from cells grown in media supplemented with 16 nM TPA, and lanes marked (160) indicate total RNA from cells grown in media supplemented with 160 nM TPA. Times shown above the lanes indicate when cells were harvested after the start of each experiment. (A) TPA treatment of HL-60 cells was found to decrease hD52 and transferrin receptor (TR) transcript levels after 18 hrs TPA treatment. hD53 transcripts were not detected in HL-60 cells. Similar results were obtained in at least one other experiment performed on a separate occasion. (B) TPA treatment of K-562 cells was found to decrease hD53 and transferrin receptor (TR) transcript levels after 24 hrs TPA treatment. hD52 transcripts were not detected in K-562 cells.

Figure 30. Southern Blot Analysis of Three Representative Breast Cancer Tumor DNAs with Amplifications of Chromosomal Region 17q11-q21. (L) and (T) indicate matched TaqI-digested DNA samples isolated from peripheral leukocytes and tumor tissue, respectively. Hybridizations were carried out successively with probes MLN 50, 51, 62, 64 and ERBB2. Case 309 shows amplifications for MLN 62, ERBB2 and MLN 64. Case 1191 shows amplification for only MLN 62. Case 1512 shows amplifications for ERBB2 and MLN 64.

Figure 31. 17q11-q21 Amplicon Maps in Human Breast Cancer. Lines correspond to each tumor sample, columns to each marker. The densitometrically determined gene dosages (amplification levels) were subdivided into four categories. White boxes represent a normal copy number; shaded boxes 2-5 times amplification, dark shaded boxes 6-10 times amplification, and black boxes > ten times amplification. The loci from 17q11-q21 are ordered according to their chromosomal location, from the most centromeric locus (MLN 62) to the most telomeric locus (MLN 51).
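
The four amplification categories of the Figure 31 legend can be expressed as a small lookup. The Python sketch below is illustrative only: the function name and the handling of values that fall between the stated ranges (below 2, or between 5 and 6) are assumptions, since the legend gives whole-number ranges and does not say how the densitometric readings were rounded.

```python
def amplification_category(fold: float) -> str:
    """Map a gene dosage (fold amplification relative to normal) to the
    box shading described in the Figure 31 legend.

    Assumption: values below 2 are treated as normal copy number, and
    values between 5 and 6 fall into the 2-5 times category.
    """
    if fold > 10:
        return "black"        # more than ten times amplification
    if fold >= 6:
        return "dark shaded"  # 6-10 times amplification
    if fold >= 2:
        return "shaded"       # 2-5 times amplification
    return "white"            # normal copy number


if __name__ == "__main__":
    # Hypothetical dosages for one tumor, ordered centromeric to telomeric.
    dosages = {"MLN 62": 8.0, "MLN 50": 3.0, "ERBB2": 12.0, "MLN 64": 12.0, "MLN 51": 1.0}
    for locus, fold in dosages.items():
        print(locus, amplification_category(fold))
```

The dosages in the usage example are made-up values for demonstration and do not come from the figure.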

Figure 32. Northern Blot Analysis of MLN 50, 51, 62, 64 and ERBB2 in Normal and Tumoral Breast Tissues. N1 and N2, normal breast tissues; T309, T1191 and T1512, breast tumor tissues. Hybridizations were carried out successively with probes MLN 50, 51, 62, 64 and ERBB2. Control hybridizations with the 36B4 probe showed that similar amounts of mRNA were loaded in each case. Right, approximate sizes of the mRNAs are indicated in kb. Case 309 shows overexpressions for MLN 62, ERBB2 and MLN 64, compared with normal breast tissues. Case 1191 shows overexpression for only MLN 62. Case 1512 shows overexpressions for ERBB2 and MLN 64.

Detailed Description of the Invention 

Isolation and Localization of Six Novel Genes, MLN 50, 51, 62,
64, D53 and mD52

The present inventors have identified four genes, co-localized on the long arm of chromosome 17, which are amplified and overexpressed in malignant breast tissues. In order to identify and clone these genes involved in tumor progression, differential screening of a cDNA library from breast cancer derived metastatic axillary lymph nodes was performed. The method involved screening the MLN cDNA library using two probes representative of malignant (MLN) and nonmalignant (fibroadenomas; FA) breast tissues. FAs were selected as control tissues since, although nonmalignant, they are proliferating tissues, thereby minimizing the probability of identifying mRNAs characteristic of cellular growth, but unrelated to the malignant process. The differential screening method is explained in detail in Example 1, below, and in Basset, P. et al., Nature 348:699-704 (1990), where it is described as allowing identification of the stromelysin-3 gene (see also, U.S. Pat. No. 5,236,844).

Four differential clones (MLN 50, 51, 62 and 64) were isolated which correspond to cDNAs whose sequences do not belong to any previously characterized gene or protein family as determined by comparison to the combined GenBank/EMBL databanks. By in situ hybridization of metaphase cells, the four new genes of the present invention were determined to be co-located to the q11-q21.3 region of the chromosome 17 long arm. Several genes implicated in breast cancer progression have been assigned to the same portion of chromosome 17, most notably the oncogene c-erbB-2 in q12 (Fukushige, S.I. et al., Mol. Cell. Biol. 6:955-958 (1986)) and the recently cloned tumor suppressor gene BRCA1 in q21 (Hall, J.M. et al., Science 250:1684-1689 (1990); and Miki, Y. et al., Science 266:66-71 (1994)). According to their chromosomal assignments, the present inventors mapped the four novel genes proximal (MLN 62 and 50) and distal (MLN 64 and 51) to the c-erbB-2 gene, and proximal to the BRCA1 gene.

It has been shown previously that multiple chromosome segments on the chromosome 17 long arm are targets for amplification in breast tumorigenesis (Muleris, M. et al., Genes Chrom. Cancer 10:160-170 (1994); Kallioniemi, A. et al., Proc. Natl. Acad. Sci. USA 91:2156-2160 (1994)), and 17q12 was found to be the most commonly amplified chromosomal band-region (Guan, X.Y. et al., Nat. Genet. 8:155-161 (1994)). Consistently, in breast cancers, c-erbB-2 overexpression is most often correlated to gene amplification (Slamon, D.J. et al., Science 235:177-182 (1987); van de Vijver, M. et al., Mol. Cell. Biol. 7:2019-2023 (1987)).

It is assumed in the art that DNA amplification plays a crucial role in tumor
progression by allowing cancer cells to upregulate numerous genes (Kallioniemi,
A. et al., Proc. Natl. Acad. Sci. USA 91:2156-2160 (1994); Lonn, U. et al.,
Intl. J. Cancer 58:40-45 (1994)). Amplification is known to target oncogenes
and genes involved in drug resistance. Frequency of gene amplification as well
as gene copy number increase during breast cancer progression, notably in
patients who do not respond to treatment, suggesting that overexpression of
the amplified target genes confers a selective advantage to malignant cells
(Lonn, U. et al., Intl. J. Cancer 58:40-45 (1994); Guan, X.Y. et al., Nat.
Genet. 8:155-161 (1994)). In vivo, the four MLN genes showed amplification in
10-20% of breast carcinomas tested.

The D52 gene has been isolated by differential screening of a cDNA library
from primary infiltrating ductal breast carcinoma (Byrne, J.A. et al., Cancer
Res. 55:2896-2903 (1995)) and found to be overexpressed and localized
exclusively to cancer cells, and not to other cell types such as fibroblastic
cells. By in situ hybridization of metaphase cells, D52 was localized to
chromosome 8q21. This region of the human genome has been noted to be amplified
in breast cancer cell lines, and it was suggested that the frequent gain of the
entire chromosome 8q arm in breast carcinomas may indicate the existence of
several important loci within this region (Kallioniemi, A. et al., Proc. Natl.
Acad. Sci. USA 91:2156-2160 (1994)).

The present inventors have isolated a homolog of D52 by screening a
cDNA library from primary infiltrating ductal breast carcinoma with an expressed
sequence tag (EST) that was identified to be homologous to the hD52 gene,
followed by a secondary screening of the resulting positive clones. The method
for cloning the D52 homolog is explained in detail in Example 5 below. One clone
(D53) was isolated by the present inventors that encodes a protein sharing 52%
identity to the D52 protein. By in situ hybridization of metaphase cells, the
new gene of the present invention was determined to be localized to the
q22-q23 region of chromosome 6.

The present inventors have also isolated a murine homolog of the hD52
gene from an apoptotic mouse mammary gland cDNA library by screening with
a fragment (containing 91 bp of 5'UTR and 491 bp of coding sequence) of the
hD52 gene. The method for cloning the murine (m)D52 is explained in detail in
Example 5 below. The mD52 clone encodes a 185 amino acid protein sharing
82% homology with hD52. By in situ hybridization of murine metaphase cells, the
mD52 gene of the present invention was determined to be localized to
chromosome 3A1-3A2, as well as chromosome 8C.

MLN 50, 51, 62 and 64 as Breast Cancer Prognosticators

The four MLN genes of the present invention encode polypeptides which
are useful as prognostic markers for breast cancer. It is known in the art that
prognostic markers provide important information in the management of breast
cancer patients (Elias et al., J. Histotechnol. 15(4):315-320 (1992)). For
example, for application of systemic adjuvant therapy in primary breast cancer,
identification of high- and low-risk patients is a major issue (McGuire, W.L.,
N. Engl. J. Med. 320:525-527 (1989)). Several classical (tumor size, lymph node
status, histopathology, steroid receptor status) and second-generation
prognostic factors (proliferation rate, DNA ploidy, oncogenes, growth factor
receptors and some glycoproteins) are currently available for making
therapeutic decisions (McGuire, W.L., Prognostic Factors for Recurrence and
Survival, in Educational Booklet American Society of Clinical Oncology, 25th
Annual Meeting, 89-92 (1989); Contesso et al., Eur. J. Clin. Oncol. 25:403-409
(1989)). Although no group of the art-known prognosticators completely fulfills
the objective to fully distinguish high- and low-risk patients, combinations of
the prognostic factors can improve the prediction of a patient's prognosis
(McGuire, W.L., N. Engl. J. Med. 320:525-527 (1989)). Thus, by the invention,
further prognostic markers are provided which can be added to the population of
art-known prognosticators to more particularly distinguish between high- and
low-risk breast cancer patients.

The present inventors have discovered that, in many instances, cells
obtained from breast tumors contain significantly greater copy number of at least
one of the four MLN genes and express significantly enhanced levels of MLN 50,
51, 62 or 64 mRNA and/or protein when compared to cells obtained from
"normal" breast tissue, i.e., non-tumorigenic breast tissue. Thus, the invention
provides a method useful during breast cancer prognosis which involves assaying
a first MLN 50, 51, 62 or 64 gene expression level or gene copy number in breast
tissue and comparing the gene expression level or gene copy number with a
second MLN 50, 51, 62 or 64 gene expression level or gene copy number,
whereby the relative levels of said first gene expression level or gene copy
number over said second is a prognostic marker for breast cancer.
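
The comparison of a first level against a second level can be illustrated with a short Python sketch. It is only an illustration: the function names and the two-fold cutoff are assumptions introduced here, and the specification does not state any numerical threshold.

```python
def relative_level(first, second):
    """Ratio of the first (test) level to the second (reference) level.

    Either value may be an expression level or a gene copy number,
    as long as both are measured in the same units.
    """
    if second <= 0:
        raise ValueError("reference level must be positive")
    return first / second


def is_elevated(first, second, threshold=2.0):
    # threshold=2.0 is a hypothetical cutoff chosen for illustration only.
    return relative_level(first, second) >= threshold


# Hypothetical copy numbers: 8 copies in tumor tissue, 2 in reference tissue.
print(relative_level(8.0, 2.0))
print(is_elevated(8.0, 2.0))
```

In practice the cutoff would have to be established empirically from reference samples.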

The present inventors have not observed any unamplified tumor
overexpression of the MLN 50, 51, 62 or 64 genes. Thus, while the inventors do
not intend to be bound by theory, it appears that the four MLN genes could not
be activated by mechanisms other than gene amplification in breast carcinoma such
as, for example, by alteration of regulatory sequences of the genes. Accordingly,
by the invention, gene amplification and enhanced gene expression over the
standard is clinically relevant for breast cancer prognosis as independent studies
have shown an association between the presence of amplification and an increased
risk of relapse (Slamon et al., Science 235:177 (1987); Ravdin & Chamness, Gene
159:19 (1995)).

The methods of the invention can be used alone or together with other
markers known in the art for breast cancer prognosis, including those discussed
above. By "assaying MLN 50, 51, 62 or 64 gene expression level" is intended
qualitatively or quantitatively measuring or estimating the MLN 50, 51, 62 or 64
protein level or MLN 50, 51, 62 or 64 mRNA level in a first biological sample
either directly or relatively by comparing to the MLN 50, 51, 62 or 64 protein
level or mRNA level in a second biological sample. By "assaying MLN 50, 51, 62
or 64 gene copy number" is intended qualitatively or quantitatively measuring or
estimating MLN 50, 51, 62 or 64 gene copy number in a first biological sample
either directly or relatively by comparing to the MLN 50, 51, 62 or 64 gene copy
number in a second biological sample.

Preferably, the MLN 50, 51, 62 or 64 protein level, mRNA level, or gene
copy number in the first biological sample is measured or estimated and compared
to a second standard MLN 50, 51, 62 or 64 protein level, mRNA level, or gene
copy number, the standard being taken from a second biological sample obtained
from an individual not having breast cancer. As will be appreciated in the art,
once a standard MLN 50, 51, 62 or 64 protein level, mRNA level, or gene copy
number is known, it can be used repeatedly as a standard for comparison. It will
also be appreciated in the art, however, that the first and second biological
samples can both be obtained from individuals having breast cancer. In such a
scenario, the relative MLN 50, 51, 62 or 64 protein levels, mRNA levels or gene
copy numbers will provide a relative prognosis between the individuals.

By "biological sample" is intended any biological sample obtained from
an individual, cell line, tissue culture, or other source which contains MLN 50,
51, 62 or 64 protein; MLN 50, 51, 62 or 64 mRNA; or the MLN 50, 51, 62 or 64
gene. Preferably, the biological sample includes tumorigenic or non-tumorigenic
breast tissue. Methods for obtaining tissue biopsies are well known in the art.

The present invention is useful as a prognostic indicator for breast cancer
in mammals. Preferred mammals include monkeys, apes, cats, dogs, cows, pigs,
horses, rabbits and humans. Particularly preferred are humans.

Assaying MLN 50, 51, 62 or 64 gene copy number can occur according
to any known technique such as, for example, by visualizing extrachromosomal
double minutes (dmin) or integrated homogeneously staining regions (hsrs)
(Gebhart et al., Breast Cancer Res. Treat. 8:125 (1986); Dutrillaux et al.,
Cancer Genet. Cytogenet. 49:203 (1990)). Other techniques such as comparative
genomic hybridization (CGH) and a strategy based on chromosome
microdissection and fluorescence in situ hybridization can also be used to search
for regions of increased DNA copy number in tumor cells (Guan et al., Nature
Genet. 8:155 (1994); Muleris et al., Genes Chrom. Cancer 10:160-170 (1994)). DNA
probes that hybridize to the four MLN genes can be prepared as described below.

Total cellular RNA can be isolated from normal and tumorigenic breast
tissue using any suitable technique such as the single-step guanidinium-
thiocyanate-phenol-chloroform method described in Chomczynski and Sacchi,
Anal. Biochem. 162:156-159 (1987). The LiCl/urea method described in Auffray
and Rougeon, Eur. J. Biochem. 107:303 (1980) can also be used. MLN 50, 51,
62 or 64 mRNA levels are then assayed using any appropriate method. These
include Northern blot analysis, S1 nuclease mapping, the polymerase chain
reaction (PCR), reverse transcription in combination with the polymerase chain
reaction (RT-PCR), and reverse transcription in combination with the ligase chain
reaction (RT-LCR).

Northern blot analysis can be performed as described in Harada et al., Cell
63:303-312 (1990). Briefly, total RNA is prepared from a biological sample as
described above. For the Northern blot, the RNA is denatured in an appropriate
buffer (such as glyoxal/dimethyl sulfoxide/sodium phosphate buffer), subjected
to agarose gel electrophoresis, and transferred onto a nitrocellulose or nylon
filter. MLN 50, 51, 62 or 64 DNA labeled according to any appropriate method
(such as the 32P-multiprimed DNA labeling system (Amersham)) is used as probe.
After hybridization, the filter is washed and exposed to x-ray film.

MLN 50, 51, 62 or 64 DNA for use as probes according to the present
invention are described below. Where a fragment is used, the DNA probe will be
at least about 15-30 nucleotides in length, and preferably, at least about 50
nucleotides in length.

S1 mapping can be performed as described in Fujita et al., Cell 49:357-
367 (1987). To prepare probe DNA for use in S1 mapping, the sense strand of
MLN 50, 51, 62 or 64 cDNA is used as a template to synthesize labeled antisense
DNA. The antisense DNA can then be digested using an appropriate restriction
endonuclease to generate further DNA probes of a desired length. Such antisense
probes are useful for visualizing protected bands corresponding to MLN 50, 51,
62 or 64 mRNA. Northern blot analysis can be performed as described above.
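
The relationship between a sense-strand template and the antisense probe synthesized from it can be sketched in Python as a reverse complement. The sequence below is a made-up toy fragment, not taken from any SEQ ID NO, and the function name is an assumption for illustration.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense):
    """Return the antisense strand (reverse complement), written 5' to 3'."""
    return sense.translate(COMPLEMENT)[::-1]


toy_sense = "ATGGCTAAA"  # hypothetical sense-strand fragment
print(antisense(toy_sense))  # TTTAGCCAT
```

An antisense probe produced this way is complementary to the mRNA, which is why it can protect the target transcript from S1 nuclease digestion.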

Alternatively, MLN 50, 51, 62 or 64 mRNA levels are assayed using the
RT-PCR method described in Makino et al., Technique 2:295-301 (1990). By
this method, the radioactivities of the amplification products in the
polyacrylamide gel bands are linearly related to the initial concentration of the
target mRNA. Briefly, this method involves adding total RNA isolated from a
biological sample in a reaction mixture containing a RT primer and appropriate
buffer. After incubating for primer annealing, the mixture can be supplemented
with a RT buffer, dNTPs, DTT, RNase inhibitor and reverse transcriptase. After
incubation to achieve reverse transcription of the RNA, the RT products are then
subject to PCR using labeled primers. Alternatively, rather than labeling the
primers, a labeled dNTP can be included in the PCR reaction mixture. PCR
amplification can be performed in a DNA thermal cycler according to conventional
techniques. After a suitable number of rounds to achieve amplification, the PCR
reaction mixture is electrophoresed on a polyacrylamide gel. After drying the
gel, the radioactivity of the appropriate bands (corresponding to the MLN 50,
51, 62 or 64 mRNA) is quantified using an imaging analyzer. RT and PCR reaction
ingredients and conditions, reagent and gel concentrations, and labeling methods
are well known in the art. Variations on the RT-PCR method will be apparent to
the skilled artisan.

Any set of oligonucleotide primers which will amplify reverse transcribed
MLN 50, 51, 62 or 64 mRNA can be used and can be designed by reference to
the MLN 50, 51, 62 or 64 DNA sequence provided below.

Assaying MLN 50, 51, 62 or 64 protein levels in a biological sample can
occur using any art-known method. Preferred are antibody-based techniques. For
example, MLN 50, 51, 62 or 64 protein expression in tissues can be studied with
classical immunohistological methods. In these, the specific recognition is
provided by the primary antibody (polyclonal or monoclonal) but the secondary
detection system can utilize fluorescent, enzyme, or other conjugated secondary
antibodies. As a result, an immunohistological staining of tissue section for
pathological examination is obtained. Tissues can also be extracted, e.g., with
urea and neutral detergent, for the liberation of MLN 50, 51, 62 or 64 protein for
Western-blot or dot/slot assay (Jalkanen, M., et al., J. Cell Biol. 101:976-985
(1985); Jalkanen, M., et al., J. Cell Biol. 105:3087-3096 (1987)). In this
technique, which is based on the use of cationic solid phases, quantitation of MLN
50, 51, 62 or 64 protein can be accomplished using isolated MLN 50, 51, 62 or
64 as a standard. This technique can also be applied to body fluids. With these
samples, a molar concentration of MLN 50, 51, 62 or 64 protein will aid to set
standard values of MLN 50, 51, 62 or 64 protein content for different body fluids,
like serum, plasma, urine, spinal fluid, etc. The normal appearance of MLN 50,
51, 62 or 64 amounts can then be set using values from healthy individuals, which
can be compared to those obtained from a test subject.

Other antibody-based methods useful for detecting MLN 50, 51, 62 or 64
gene expression include immunoassays, such as the enzyme linked immunosorbent
assay (ELISA) and the radioimmunoassay (RIA). For example, a monoclonal
antibody can be used both as an immunoabsorbent and as an enzyme-labeled probe
to detect and quantify the MLN 50, 51, 62 or 64 protein. The amount of MLN
50, 51, 62 or 64 protein present in the sample can be calculated by reference to the
amount present in a standard preparation using a linear regression computer
algorithm. Such an ELISA for detecting a tumor antigen is described in Iacobelli
et al., Breast Cancer Research and Treatment 11:19-30 (1988). In another
ELISA assay, two distinct monoclonal antibodies can be used to detect MLN 50,
51, 62 or 64 protein in a body fluid. In this assay, one of the antibodies is used as
the immunoabsorbent and the other as the enzyme-labeled probe.
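
The linear regression step mentioned above, in which the amount of protein in a sample is read off a standard preparation, can be sketched as an ordinary least-squares fit. The standard amounts and absorbance readings below are invented for illustration, and the function names are assumptions; the specification does not prescribe a particular algorithm or data format.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate the amount in a sample."""
    return (signal - intercept) / slope


# Hypothetical standard preparation: known amounts (ng) and their readings.
standard_amounts = [0.0, 1.0, 2.0, 4.0]
standard_signals = [0.05, 0.25, 0.45, 0.85]

slope, intercept = fit_line(standard_amounts, standard_signals)
estimate = amount_from_signal(0.65, slope, intercept)
print(round(estimate, 3))
```

A straight-line fit is only appropriate within the linear range of the assay; readings outside the range of the standards should not be extrapolated.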

The above techniques may be conducted essentially as a "one-step" or
"two-step" assay. The "one-step" assay involves contacting MLN 50, 51, 62 or
64 protein with immobilized antibody and, without washing, contacting the
mixture with the labeled antibody. The "two-step" assay involves washing before
contacting the mixture with the labeled antibody. Other conventional methods may
also be employed as suitable. It is usually desirable to immobilize one component
of the assay system on a support, thereby allowing other components of the
system to be brought into contact with the component and readily removed from
the sample.

Suitable enzyme labels include, for example, those from the oxidase group,
which catalyze the production of hydrogen peroxide by reacting with substrate.
Glucose oxidase is particularly preferred as it has good stability and its substrate
(glucose) is readily available. Activity of an oxidase label may be assayed by
measuring the concentration of hydrogen peroxide formed by the enzyme-labeled
antibody/substrate reaction. Besides enzymes, other suitable labels include
radioisotopes, such as iodine (125I, 121I), carbon (14C), sulphur (35S), tritium (3H),
indium (112In), and technetium (99mTc), and fluorescent labels, such as fluorescein
and rhodamine, and biotin.

In addition to assaying MLN 50, 51, 62 or 64 protein levels in a biological
sample obtained from an individual, MLN 50, 51, 62 or 64 protein can also be
detected in vivo by imaging. Antibody labels or markers for in vivo imaging of
MLN 50, 51, 62 or 64 protein include those detectable by X-radiography, NMR
or ESR. For X-radiography, suitable labels include radioisotopes such as barium
or caesium, which emit detectable radiation but are not overtly harmful to the
subject. Suitable markers for NMR and ESR include those with a detectable
characteristic spin, such as deuterium, which may be incorporated into the
antibody by labeling of nutrients for the relevant hybridoma.

An antibody or antibody fragment which has been labeled with an
appropriate detectable imaging moiety, such as a radioisotope (for example, 131I,
112In, 99mTc), a radio-opaque substance, or a material detectable by nuclear
magnetic resonance, is introduced (for example, parenterally, subcutaneously or
intraperitoneally) into the mammal to be examined for breast cancer. It will be
understood in the art that the size of the subject and the imaging system used will
determine the quantity of imaging moiety needed to produce diagnostic images.
In the case of a radioisotope moiety, for a human subject, the quantity of
radioactivity injected will normally range from about 5 to 20 millicuries of 99mTc.
The labeled antibody or antibody fragment will then preferentially accumulate at
the location of cells which contain the protein. In vivo tumor imaging is described
in S.W. Burchiel et al., "Immunopharmacokinetics of Radiolabelled Antibodies
and Their Fragments," in Tumor Imaging: The Radiochemical Detection of
Cancer (S.W. Burchiel and B.A. Rhodes, eds., Masson Publishing Inc. (1982)).

Antibodies for use in the present invention can be raised against the intact
MLN 50, 51, 62 or 64 protein or an antigenic polypeptide fragment thereof, which
may be presented together with a carrier protein, such as an albumin, to an animal
system (such as rabbit or mouse) or, if it is long enough (at least about 25 amino
acids), without a carrier. As used herein, the term "antibody" (Ab) or
"monoclonal antibody" (Mab) is meant to include intact molecules as well as
antibody fragments (such as, for example, Fab and F(ab')2 fragments) which are
capable of specifically binding to the MLN 50, 51, 62 or 64 protein. Fab and
F(ab')2 fragments lack the Fc fragment of intact antibody, clear more rapidly from
the circulation, and may have less non-specific tissue binding than an intact
antibody (Wahl et al., J. Nucl. Med. 24:316-325 (1983)). Thus, these fragments
are preferred.

The antibodies of the present invention may be prepared by any of a
variety of methods. For example, cells expressing the MLN 50, 51, 62 or 64
protein or an antigenic fragment thereof can be administered to an animal in order
to induce the production of sera containing polyclonal antibodies. In a preferred
method, a preparation of MLN 50, 51, 62 or 64 is prepared and purified to render
it substantially free of natural contaminants. Such a preparation is then introduced
into an animal in order to produce polyclonal antisera of greater specific activity.
In the most preferred method, the antibodies of the present invention are
monoclonal antibodies (or MLN 50, 51, 62 or 64-binding fragments thereof).
Such monoclonal antibodies can be prepared using hybridoma technology (Kohler
et al., Nature 256:495 (1975); Kohler et al., Eur. J. Immunol. 6:511 (1976);
Kohler et al., Eur. J. Immunol. 6:292 (1976); Hammerling et al., Monoclonal
Antibodies and T-Cell Hybridomas, 563-681 (Elsevier, N.Y., 1981)). In
general, such procedures involve immunizing an animal (preferably a mouse) with
a MLN 50, 51, 62 or 64 antigen or, more preferably, with a cell expressing the
antigen. Suitable cells can be recognized by their capacity to bind anti-MLN 50,
51, 62 or 64 antibody. Such cells may be cultured in any suitable tissue culture
medium; however, it is preferable to culture cells in Earle's modified Eagle's
medium supplemented with 10% fetal bovine serum (inactivated at about 56°C),
and supplemented with about 10 µg/l of nonessential amino acids, about 1,000
U/ml of penicillin, and about 100 µg/ml of streptomycin. The splenocytes of such
mice are extracted and fused with a suitable myeloma cell line. Any suitable
myeloma cell line may be employed in accordance with the present invention;
however, it is preferable to employ the parent myeloma cell line (SP2O), available
from the American Type Culture Collection, Rockville, Maryland. After fusion,
the resulting hybridoma cells are selectively maintained in HAT medium, and then
cloned by limiting dilution as described by Wands et al., Gastroenterology 80:225-
232 (1981). The hybridoma cells obtained through such a selection are then
assayed to identify clones which secrete antibodies capable of binding the MLN
50, 51, 62 or 64 antigen.

It will be appreciated that Fab and F(ab')2 and other fragments of the
antibodies of the present invention may be used according to the methods
disclosed herein. Such fragments are typically produced by proteolytic cleavage,
using enzymes such as papain (to produce Fab fragments) or pepsin (to produce
F(ab')2 fragments). Alternatively, antigen binding fragments can be produced
through the application of recombinant DNA technology or through synthetic
chemistry.

Where in vivo imaging is used to detect levels of MLN 50, 51, 62 or 64
protein in humans, it may be preferable to use "humanized" chimeric monoclonal
antibodies. Such antibodies can be produced using genetic constructs derived
from hybridoma cells producing the monoclonal antibodies described above.
Methods for producing chimeric antibodies are known in the art. See, for review,
Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214 (1986);
Cabilly et al., U.S. Patent No. 4,816,567; Taniguchi et al., EP 171496; Morrison
et al., EP 173494; Neuberger et al., WO 8601533; Robinson et al., WO 8702671;
Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature 314:268
(1985).

D52/D53 Gene Expression as a Marker to Distinguish Different
Types of Leukemia

The present inventors have further discovered that the relative expression
levels of the D52 and D53 genes can be used to distinguish between different
types of leukemia. In particular, the inventors have observed that the D52 gene
is expressed in leukemia cells that have myelocytic characteristics (such as HL-60
cells) but not in leukemia cells having erythroid characteristics (such as K 562
cells); whereas the inverse is true for D53 gene expression. Thus, the invention
further provides a diagnostic method for distinguishing between different types of
leukemia, which involves assaying leukemia cells for D52 or D53 gene expression;
whereby, the presence of D52 gene expression or the lack of D53 gene expression
indicates that the leukemia cells have myelocytic characteristics and the presence
of D53 gene expression or the lack of D52 gene expression indicates that the
leukemia cells have erythroid characteristics. Preferably, the method is used to
distinguish different types of acute myeloid leukemia. As indicated, the method
of the invention can be performed by assaying for the presence or absence of
either D52 or D53 gene expression. However, preferably, the expression of both
genes is assayed.
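
The decision rule just described can be written out as a small Python sketch. The function name, the boolean inputs, and the "indeterminate" outcome for conflicting or missing results are assumptions added for illustration; the specification itself only states the two indicative patterns.

```python
def classify_leukemia(d52_expressed=None, d53_expressed=None):
    """Apply the D52/D53 rule; pass None for a gene that was not assayed."""
    calls = set()
    if d52_expressed is not None:
        # D52 present suggests myelocytic; D52 absent suggests erythroid.
        calls.add("myelocytic" if d52_expressed else "erythroid")
    if d53_expressed is not None:
        # D53 present suggests erythroid; D53 absent suggests myelocytic.
        calls.add("erythroid" if d53_expressed else "myelocytic")
    # Agreement gives a call; no data or a conflict is left undecided.
    return calls.pop() if len(calls) == 1 else "indeterminate"


print(classify_leukemia(d52_expressed=True, d53_expressed=False))
print(classify_leukemia(d52_expressed=False, d53_expressed=True))
```

Assaying both genes, as the text prefers, lets the two results cross-check each other, which is why a conflicting pair is reported here as indeterminate rather than forced into a class.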

The human (h) D52 gene is described in detail in Byrne, J.A., et al.,
Cancer Research 55:2896-2903 (1995) and the mD52 gene is described below.
The hD53 gene is also described below. Methods for detecting D52 and D53
gene expression in leukemia cells are described in detail above and in the
Examples below. As above, D52 and D53 gene expression can be assayed by
detecting either the corresponding mRNA or protein.

MLN 50, 51, 62, 64 and D53 Nucleic Acid Molecules,
Polypeptides and Fragments Thereof

Using the information provided herein, such as the nucleotide sequences
of MLN 62, 50, 64, 51, D53, or mD52 as set out in Figures 6, 14, 16, 21(A-D),
24(B) and 25(B), respectively (SEQ ID NOS:1, 3, 5, 7, 9 and 11, respectively),
an isolated nucleic acid molecule of the present invention may be obtained using
standard cloning and screening procedures, such as those for cloning cDNAs using
mRNA as starting material.

By "isolated" nucleic acid molecule(s) is intended a nucleic acid molecule,
DNA or RNA, which has been removed from its native environment. For
example, recombinant DNA molecules contained in a vector are considered
isolated for purposes of the invention as are recombinant DNA molecules
maintained in heterologous host cells or purified (partially or substantially) DNA
molecules in solution. Isolated RNA molecules include in vitro RNA transcripts
of the DNA molecules of the present invention. By "isolated" polypeptide or
protein is intended a polypeptide or protein removed from its native environment.
For example, recombinantly produced polypeptides and proteins expressed in host
cells are considered isolated for purposes of the invention, as are native or
recombinant polypeptides which have been partially or substantially purified by
any suitable technique such as, for example, the single-step purification method
disclosed in Smith and Johnson, Gene 67:31-40 (1988). Isolated nucleic acid
molecules and polypeptides also include such compounds produced synthetically.

As indicated, nucleic acid molecules of the present invention may be in the
form of RNA, such as mRNA, or in the form of DNA, including, for instance,
cDNA and genomic DNA obtained by cloning or produced synthetically. The
DNA may be double- or single-stranded. Single-stranded DNA may be the coding
strand, also known as the sense strand, or it may be the noncoding strand, also
referred to as the antisense strand.

The MLN 50, 51, 62, 64 genes and the D53 gene were deposited on June
14, 1996, at the American Type Culture Collection, 12301 Park Lawn Drive,
Rockville, Maryland 20852 and given the accession numbers indicated herein.

The MLN 50, 51, 62, 64, D53 and mD52 nucleic acid molecules of the
present invention are discussed in more detail below.

MLN62 

The present invention provides isolated nucleic acid molecules comprising
a polynucleotide encoding the CART1 polypeptide (corresponding to the MLN
62 cDNA clone) whose amino acid sequence is shown in Figure 6 (SEQ ID NO:2)
or a fragment of the polypeptide. Such isolated nucleic acid molecules include
DNA molecules comprising an open reading frame (ORF) whose initiation codon
is at position 85-87 of the nucleotide sequence shown in Figure 6 (SEQ ID NO:1)
and further include DNA molecules which comprise a sequence substantially
different than all or part of the ORF whose initiation codon is at position 85-87 of
the nucleotide sequence of Figure 6 (SEQ ID NO:1) but which, due to the
degeneracy of the genetic code, still encode the CART1 polypeptide or a fragment
thereof. Of course, the genetic code is well known in the art. Thus, it would be
routine for one skilled in the art to generate the degenerate variants described
above.
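
To make the idea of a degenerate variant concrete, the following Python sketch builds the standard genetic code table and swaps each codon for a synonymous one where available, then checks that the encoded peptide is unchanged. The input is a made-up toy ORF, not a sequence from this application, and the helper names are assumptions for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(orf):
    """Split a sequence into complete codons, ignoring trailing bases."""
    return [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]


def translate(orf):
    return "".join(CODON_TABLE[c] for c in codons(orf))


def degenerate_variant(orf):
    """Replace each codon with a different synonymous codon if one exists."""
    out = []
    for codon in codons(orf):
        synonyms = sorted(c for c, aa in CODON_TABLE.items()
                          if aa == CODON_TABLE[codon] and c != codon)
        out.append(synonyms[0] if synonyms else codon)
    return "".join(out)


toy_orf = "ATGGCTAAATAA"  # hypothetical ORF: Met-Ala-Lys-stop
variant = degenerate_variant(toy_orf)
print(variant, translate(variant) == translate(toy_orf))
```

Codons with no synonym, such as ATG for methionine, are left unchanged, so the initiation codon is preserved.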

The invention further provides isolated nucleic acid molecules encoding
the CART1 polypeptide having an amino acid sequence as encoded by the cDNA
of the clone deposited as ATCC Deposit No. 97610 on June 14, 1996.

The invention further provides an isolated nucleic acid molecule having the
nucleotide sequence shown in Figure 6 (SEQ ID NO:1) or the nucleotide
sequence of the CART1 gene contained in the above-described deposited cDNA,
or a fragment thereof. Such isolated DNA molecules and fragments thereof are
useful as DNA probes for gene mapping by in situ hybridization with
chromosomes and for detecting expression of the CART1 gene in human tissues
(including breast and lymph node tissues) by Northern blot analysis. Of course,
as discussed above, if a DNA molecule includes the ORF whose initiation codon
is at position 85-87 of Figure 6 (SEQ ID NO:1), then it is also useful for
expressing the CART1 polypeptide or a fragment thereof.

MLN50 

The present invention also provides isolated nucleic acid molecules
comprising a polynucleotide encoding the Lasp-1 polypeptide (corresponding to
the MLN 50 cDNA clone) whose amino acid sequence is shown in Figure 14
(SEQ ID NO:4) or a fragment of the polypeptide. Such isolated nucleic acid
molecules include DNA molecules comprising an open reading frame (ORF)
whose initiation codon is at position 76-78 of the nucleotide sequence of Figure
14 (SEQ ID NO:3) and further include DNA molecules which comprise a
sequence substantially different than all or part of the ORF whose initiation codon
is at position 76-78 of the nucleotide sequence of Figure 14 (SEQ ID NO:3) but
which, due to the degeneracy of the genetic code, still encode the Lasp-1
polypeptide. Of course, the genetic code is well known in the art. Thus, it would
be routine for one skilled in the art to generate the degenerate variants described
above.
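
An ORF defined by the position of its initiation codon, as in "position 76-78" above, can be located programmatically. The sketch below uses 1-based coordinates as in the text and a made-up toy sequence; the function name and error handling are assumptions for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extract_orf(seq, start_1based):
    """Return the ORF from the ATG at start_1based through the first
    in-frame stop codon (inclusive)."""
    seq = seq.upper()
    start = start_1based - 1  # convert to 0-based indexing
    if seq[start:start + 3] != "ATG":
        raise ValueError("no ATG initiation codon at the given position")
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]
    raise ValueError("no in-frame stop codon found")


# Hypothetical sequence: 4 untranslated bases, a short ORF, 2 trailing bases.
toy_cdna = "GGCC" + "ATGGCTAAATAA" + "GG"
print(extract_orf(toy_cdna, 5))  # ATGGCTAAATAA
```

With a real cDNA sequence, the same call with the stated initiation codon position would return the coding region to be expressed.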

The invention further provides isolated nucleic acid molecules encoding
the Lasp-1 polypeptide having an amino acid sequence as encoded by the cDNA
of the clone deposited as ATCC Deposit No. 97608 on June 14, 1996.

The invention further provides an isolated nucleic acid molecule having the
nucleotide sequence shown in Figure 14 (SEQ ID NO:3) or the nucleotide
sequence of the Lasp-1 gene contained in the above-described deposited cDNA,
or a fragment thereof. Such isolated DNA molecules and fragments thereof are
useful as DNA probes for gene mapping by in situ hybridization with
chromosomes and for detecting expression of the Lasp-1 gene in human tissues
(including breast and lymph node tissues) by Northern blot analysis. Of course,
as discussed above, if a DNA molecule includes the ORF whose initiation codon
is at position 76-78 of Figure 14 (SEQ ID NO:3), then it is also useful for
expressing the Lasp-1 polypeptide or a fragment thereof.

MLN64 

The present invention also provides isolated nucleic acid molecules
comprising a polynucleotide encoding the MLN 64 polypeptide whose amino acid
sequence is shown in Figure 16 (SEQ ID NO:6) or a fragment of the polypeptide.
Such isolated nucleic acid molecules include DNA molecules comprising an open
reading frame (ORF) whose initiation codon is at position 169-171 of the
nucleotide sequence of Figure 16 (SEQ ID NO:5) and further include DNA
molecules which comprise a sequence substantially different than all or part of the
ORF whose initiation codon is at position 169-171 of the nucleotide sequence of
Figure 16 (SEQ ID NO:5) but which, due to the degeneracy of the genetic code,
still encode the MLN 64 polypeptide or a fragment thereof. Of course, the genetic
code is well known in the art. Thus, it would be routine for one skilled in the art
to generate the degenerate DNA molecules above.

The invention further provides isolated nucleic acid molecules encoding
the MLN 64 polypeptide having an amino acid sequence as encoded by the cDNA
of the clone deposited as ATCC Deposit No. 97609 on June 14, 1996.

The invention further provides an isolated DNA molecule having the
nucleotide sequence shown in Figure 16 (SEQ ID NO:5) or the nucleotide
sequence of the MLN 64 gene contained in the above-described deposited cDNA,
or a fragment thereof. Such isolated DNA molecules and fragments thereof are
useful as DNA probes for gene mapping by in situ hybridization with
chromosomes and for detecting expression of the MLN 64 gene in human tissues
(including breast and lymph node tissues) by Northern blot analysis. Of course,
as discussed above, if a DNA molecule includes the ORF whose initiation codon
is at position 169-171 of Figure 16 (SEQ ID NO:5), then it is also useful for
expressing the MLN 64 polypeptide or a fragment thereof.

MLN51 

The present invention also provides isolated nucleic acid molecules
comprising a polynucleotide encoding the MLN 51 polypeptide whose amino acid
sequence is shown in Figure 21(A-D) (SEQ ID NO:8) or a fragment thereof. Such
isolated nucleic acid molecules include DNA molecules comprising an open
reading frame (ORF) whose initiation codon is at position 234-236 of the
nucleotide sequence of Figure 21(A-D) (SEQ ID NO:7) and further include DNA
molecules which comprise a sequence substantially different than all or part of
the ORF whose initiation codon is at position 234-236 of the nucleotide
sequence of Figure 21(A-D) (SEQ ID NO:7) but which, due to the degeneracy of
the genetic code, still encode the MLN 51 polypeptide or a fragment thereof.
Of course, the genetic code is well known in the art. Thus, it would be routine
for one skilled in the art to generate the degenerate DNA molecules above.

The invention further provides isolated nucleic acid molecules encoding
the MLN 51 polypeptide having an amino acid sequence as encoded by the cDNA
of the clone deposited as ATCC Deposit No. 97611 on June 14, 1996.

The invention further provides an isolated DNA molecule having the
nucleotide sequence shown in Figure 21(A-D) (SEQ ID NO:7) or the nucleotide
sequence of the MLN 51 gene contained in the above-described deposited cDNA,
or a fragment thereof. Such isolated DNA molecules and fragments thereof are
useful as DNA probes for gene mapping by in situ hybridization with
chromosomes and for detecting expression of the MLN 51 gene in human tissues
(including breast and lymph node tissues) by Northern blot analysis. Of course,
as discussed above, if a DNA molecule includes the ORF whose initiation codon
is at position 234-236 of Figure 21(A-D) (SEQ ID NO:7), then it is also useful
for expressing the MLN 51 polypeptide or a fragment thereof.

D53 

The present invention also provides isolated nucleic acid molecules
comprising a polynucleotide encoding the D53 polypeptide whose amino acid
sequence is shown in Figure 24(B) (SEQ ID NO:10) or a fragment thereof. Such
isolated nucleic acid molecules include DNA molecules comprising an open
reading frame (ORF) whose initiation codon is at position 181-183 of the
nucleotide sequence of Figure 24(B) (SEQ ID NO:9) and further include DNA
molecules which comprise a sequence substantially different than all or part of
the ORF whose initiation codon is at position 181-183 of the nucleotide
sequence of Figure 24(B) (SEQ ID NO:9) but which, due to the degeneracy of the
genetic code, still encode the D53 polypeptide or a fragment thereof. Of
course, the genetic code is well known in the art. Thus, it would be routine
for one skilled in the art to generate the degenerate DNA molecules above.

The invention further provides isolated nucleic acid molecules encoding
the D53 polypeptide having an amino acid sequence as encoded by the cDNA of
the clone deposited as ATCC Deposit No. 97607 on June 14, 1996.

The invention further provides an isolated DNA molecule having the
nucleotide sequence shown in Figure 24(B) (SEQ ID NO:9) or the nucleotide
sequence of the D53 gene contained in the above-described deposited cDNA, or
a fragment thereof. Such isolated DNA molecules and fragments thereof are
useful as DNA probes for gene mapping by in situ hybridization with
chromosomes and for detecting expression of the D53 gene in human tissue
(including breast and lymph node tissues) by Northern blot analysis. Of course,
as discussed above, if a DNA molecule includes the ORF whose initiation codon
is at position 181-183 of Figure 24(B) (SEQ ID NO:9), then it is also useful
for expressing the D53 polypeptide or a fragment thereof.

Murine D52

The present invention also provides isolated nucleic acid molecules
comprising a polynucleotide encoding the murine D52 polypeptide whose amino
acid sequence is shown in Figure 25(B) (SEQ ID NO:12) or a fragment thereof.
Such isolated nucleic acid molecules include DNA molecules comprising an open
reading frame (ORF) whose initiation codon is at position 22-24 of the
nucleotide sequence of Figure 25(B) (SEQ ID NO:11) and further include DNA
molecules which comprise a sequence substantially different than all or part of
the ORF whose initiation codon is at position 22-24 of the nucleotide sequence
of Figure 25(B) (SEQ ID NO:11) but which, due to the degeneracy of the genetic
code, still encode the D52 polypeptide or a fragment thereof. Of course, the
genetic code is well known in the art. Thus, it would be routine for one
skilled in the art to generate the degenerate DNA molecules above.

The invention further provides an isolated DNA molecule having the
nucleotide sequence shown in Figure 25(B) (SEQ ID NO:11) or a fragment
thereof. Such isolated DNA molecules and fragments thereof are useful as DNA
probes for gene mapping by in situ hybridization with chromosomes and for
detecting expression of the murine or human D52 gene in mouse or human tissue
(including breast and lymph node tissues) by Northern blot analysis. Of course,
as discussed above, if a DNA molecule includes the ORF whose initiation codon
is at position 22-24 of Figure 25(B) (SEQ ID NO:11), then it is also useful for
expressing the murine D52 polypeptide or a fragment thereof.

Fragments, Derivatives and Variants of the Isolated Nucleic Acid Molecules
of the Invention

By "fragments" of an isolated DNA molecule having the nucleotide
sequence shown in Figure 6, 14, 16, 21(A-D), 24(B), or 25(B) (SEQ ID NO:1,
3, 5, 7, 9, or 11, respectively) are intended DNA fragments at least 15 bp,
preferably at least 20 bp, and more preferably at least 30 bp in length which
are useful as DNA probes as discussed above. Of course, larger DNA fragments
of about 50-2000 bp in length are also useful as DNA probes according to the
present invention as are DNA fragments corresponding to most, if not all, of
the nucleotide sequence shown in Figure 6, 14, 16, 21(A-D), 24(B), or 25(B)
(SEQ ID NO:1, 3, 5, 7, 9, or 11, respectively). By a fragment at least 20 bp
in length, for example, is intended fragments which include 20 or more
contiguous bases from the nucleotide sequence of the deposited cDNA or the
nucleotide sequence shown in Figure 6, 14, 16, 21(A-D), 24(B), or 25(B) (SEQ
ID NO:1, 3, 5, 7, 9, or 11, respectively). As indicated, such fragments are
useful diagnostically either as a probe according to conventional DNA
hybridization techniques or as primers for amplification of a target sequence
by the polymerase chain reaction (PCR).
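
The "20 or more contiguous bases" criterion above can be sketched as a
simple check. This is an illustration only: the reference string below is
an invented 40-base toy sequence, not a sequence from any figure, the
function name shares_contiguous_run is hypothetical, and only the given
strand is examined (a complementary-strand match would need the reverse
complement as well).

```python
def shares_contiguous_run(candidate, reference, min_len=20):
    """True if `candidate` contains at least `min_len` contiguous bases
    that also occur contiguously in `reference` (same strand only)."""
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < min_len or len(reference) < min_len:
        return False
    # Index every window of length min_len in the reference.
    windows = {reference[i:i + min_len]
               for i in range(len(reference) - min_len + 1)}
    # A longer shared run always contains a shared window of min_len.
    return any(candidate[i:i + min_len] in windows
               for i in range(len(candidate) - min_len + 1))


# Invented toy reference (40 bases) standing in for a disclosed sequence.
toy_reference = "ATGGCCAAGCTGACCGGTTACGATCCGTTAGGCTAACGTA"
probe = "TTTT" + toy_reference[5:27] + "GGGG"   # carries 22 reference bases
short = toy_reference[:19]                       # only 19 bases long

print(shares_contiguous_run(probe, toy_reference))
print(shares_contiguous_run(short, toy_reference))
```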

For example, the present inventors have constructed a labeled DNA probe
corresponding to the full length human cDNA (nucleotides 1-2004) to detect
CART1 gene expression in human tissue using Northern blot analysis (see infra,
Example 2). Further, the present inventors have constructed a labeled DNA
probe corresponding to a 1.0 kb BamHI fragment to detect Lasp-1 gene
expression in human tissues using Northern blot analysis (see infra, Example
3). The present inventors have also constructed a labeled DNA probe
corresponding to nucleotides 1 to 2008 of Figure 16 (SEQ ID NO:5) to detect
MLN 64 gene expression in human tissues using Northern blot analysis (see
infra, Example 4). Still further, a 5' probe of MLN 64 was obtained using an
amplified (by PCR) DNA fragment (nucleotides 1-81 of Figure 16 (SEQ ID NO:5)),
as was a 3' probe corresponding to an EcoRI fragment (nucleotides 60-2073 of
Figure 16 (SEQ ID NO:5)). Finally, the present inventors have also labeled the
842 bp insert of clone 116783 (Fig. 1(A)) to isolate the U1 clone (now D53),
as well as to detect D53 expression in human tissues using Northern blot
analysis (see infra, Example 5).

Since the MLN 62, 50, 64, 51 genes and the D53 gene have been
deposited and the nucleotide sequences shown in Figures 6, 14, 16, 21(A-D),
24(B) and 25(B), respectively (SEQ ID NO:1, 3, 5, 7, 9, or 11, respectively)
are provided, generating such DNA fragments of the present invention would be
routine to the skilled artisan. For example, restriction endonuclease cleavage
or shearing by sonication could easily be used to generate fragments of
various sizes. Alternatively, the DNA fragments of the present invention could
be generated synthetically according to known techniques.

Preferred nucleic acid molecules of the present invention will encode the
mature form of the MLN 62, 50, 64, 51, mD52 or D53 protein and/or additional
sequences, such as those encoding the leader sequence, or the coding sequence
of the mature polypeptide, with or without the aforementioned additional
coding sequences, together with additional, noncoding sequences, including for
example, but not limited to introns and noncoding 5' and 3' sequences such as
the transcribed, nontranslated sequences that play a role in transcription,
mRNA processing (including splicing and polyadenylation signals), ribosome
binding, and mRNA stability, and additional coding sequence which codes for
additional amino acids, such as those which provide additional
functionalities. Thus, for instance, the polypeptide may be fused to a marker
sequence, such as a peptide, which facilitates purification of the fused
polypeptide. In certain preferred embodiments of this aspect of the invention,
the marker sequence is a hexa-histidine peptide, such as the tag provided in a
pQE vector (Qiagen, Inc.), among others, many of which are commercially
available. As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824
(1989), for example, hexa-histidine provides for convenient purification of
the fusion protein. The HA tag corresponds to an epitope derived of influenza
hemagglutinin protein, which has been described by Wilson et al., Cell 37:767
(1984).

The present invention further relates to variants of the isolated nucleic acid
molecules of the present invention, which encode fragments, analogs or
derivatives of the MLN 62, 50, 64, 51, mD52 or D53 protein. Variants may occur
naturally, such as an allelic variant. Non-naturally occurring variants may be
produced using art-known mutagenesis techniques, which include those produced
by nucleotide substitutions, deletions or additions. Especially preferred among
these are silent or conservative substitutions, additions and deletions, which
do not alter the properties and activities of the MLN 62, 50, 64, 51, mD52 or
D53 protein or fragment thereof.

Further embodiments of the invention include isolated nucleic acid
molecules that are at least 90% identical, and more preferably at least 95%,
97%, 98% or 99% identical to the above-described isolated nucleic acid
molecules of the present invention. In particular, the invention is directed
to isolated nucleic acid molecules at least 90%, 95%, 97%, 98%, or 99%
identical to the nucleotide sequences contained in the deposited cDNAs or in
Figures 6, 14, 16, 21(A-D), 24(B) or 25(B) (SEQ ID NO:1, 3, 5, 7, 9 or 11,
respectively).

By the invention, "% identity" between two nucleic acid sequences can be
determined using the "fastA" computer algorithm (Pearson, W.R. & Lipman, D.J.,
Proc. Natl. Acad. Sci. USA 85:2444 (1988)) with the default parameters. Uses
of such 95%, 97%, 98%, or 99% identical nucleic acid molecules of the present
invention include, inter alia, (1) isolating the MLN 62, 50, 64, 51, mD52,
hD52, or D53 gene or allelic variants thereof in a cDNA library; (2) in situ
hybridization (FISH) to metaphase chromosomal spreads to provide precise
chromosomal location of the MLN 62, 50, 64, 51, mD52, hD52 or D53 gene as
described in Verma et al., HUMAN CHROMOSOMES: A MANUAL OF BASIC TECHNIQUES
(Pergamon Press, NY, 1988); and (3) Northern blot analysis for detecting MLN
62, 50, 64, 51, mD52, hD52 or D53 mRNA expression in specific tissues.
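
As a rough illustration of what a "% identity" figure expresses, the
following sketch counts matching columns in two sequences that have
already been aligned. It is not the fastA algorithm named above, which
computes its own alignment and scoring; the function name
percent_identity, the toy sequences, and the convention of counting a
gap against a base as a mismatch are all assumptions made for this
example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared columns that are identical.

    Both inputs must be pre-aligned and of equal length, with '-' marking
    gaps. Columns where both sequences have a gap are skipped; a gap
    against a base counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy 10-base sequences differing at one position.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

Under this simple counting, a sequence would meet the "at least 90%
identical" threshold if no more than one column in ten differs.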

Guidance concerning how to make phenotypically silent amino acid
substitutions is provided in Bowie, J.U. et al., Science 247:1306-1310 (1990),
wherein the authors indicate that there are two main approaches for studying
the tolerance of an amino acid sequence to change. The first method relies on
the process of evolution, in which mutations are either accepted or rejected
by natural selection. The second approach uses genetic engineering to
introduce amino acid changes at specific positions of a cloned gene and
selections or screens to identify sequences that maintain functionality. As
the authors state, these studies have revealed that proteins are surprisingly
tolerant of amino acid substitutions. The authors further indicate which amino
acid changes are likely to be permissive at a certain position of the protein.
For example, most buried amino acid residues require nonpolar side chains,
whereas few features of surface side chains are generally conserved. Other
such phenotypically silent substitutions are described in Bowie, J.U., et al.,
Science 247:1306-1310 (1990), and the references cited therein.

The invention is further related to nucleic acid molecules capable of
hybridizing to a nucleic acid molecule having a sequence complementary to or
hybridizing directly to one of the deposited cDNAs or the nucleic acid sequence
shown in Figure 6, 14, 16, 21(A-D), 24(B) or 25(B) (SEQ ID NO:1, 3, 5, 7, 9 or
11, respectively) under stringent conditions. By "stringent conditions" is
intended overnight incubation at 42°C in a solution comprising: 50% formamide,
5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH
7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured,
sheared salmon sperm DNA (ssDNA), followed by washing the filters in 0.1x SSC
at about 65°C.
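
The salt concentrations implied by the "5x" and "0.1x" SSC steps can be
worked out with a short calculation. This sketch assumes that the
parenthetical values (150 mM NaCl, 15 mM trisodium citrate) describe the
1x composition of SSC, which matches the usual laboratory definition;
that reading, and the names SSC_1X_MM and ssc_concentrations, are
assumptions of this example rather than statements of the specification.

```python
# Assumed 1x SSC composition in millimolar (see the note above).
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(fold):
    """Scale the assumed 1x SSC composition by `fold` (e.g. 5 or 0.1)."""
    return {solute: mm * fold for solute, mm in SSC_1X_MM.items()}


hybridization = ssc_concentrations(5)    # incubation step at 42 degrees C
wash = ssc_concentrations(0.1)           # wash step at about 65 degrees C

for name, step in (("hybridization", hybridization), ("wash", wash)):
    for solute, mm in step.items():
        print(f"{name}: {solute} = {mm:.1f} mM")
```

The wash is fifty-fold lower in salt than the hybridization solution,
which, together with the higher temperature, is what makes the wash the
more stringent step.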

Examples of variant nucleic acid molecules made according to the present
invention are discussed below. The present inventors have cloned and
identified a number of MLN 64 gene variants resulting from nucleotide
substitutions, deletions and/or insertions. Interestingly, the modifications
principally occurred at exon/intron boundaries, suggesting that the MLN 64
variants result from defective splicing processes. These variations of the MLN
64 gene are described in Table VI below and include the following: two
substitutions, of a C to T at nucleotide 262 and of an A to G at nucleotide
518, changing Leu to Phe at amino acid 32 and Gln to Arg at amino acid 117,
respectively (Table VI, variants A and B); a 99 bp deletion of nucleotides 716
to 814, leading to a 33 amino acid deletion in the MLN 64 protein (i.e., a
deletion of amino acids 184-216, giving a 412 amino acid variant protein)
(Table VI, variant C); a 51 bp insertion between nucleotides 963-964,
generating a stop codon 48 bp downstream of the insertion site and giving rise
to a 281 amino acid chimeric C-terminal truncated protein containing 16
aberrant amino acids at the C-terminus (Table VI, variant D); a 657 bp
insertion between nucleotides 963-964, generating a 285 amino acid chimeric
C-terminal truncated protein containing 20 aberrant amino acids at the
C-terminus (Table VI, variant E); the 99 bp deletion described above and a 13
bp deletion of nucleotides 531-543, generating a frameshift leading to a 247
amino acid chimeric C-terminal truncated protein containing the 121 N-terminal
amino acids of MLN 64 and 126 aberrant amino acids at the C-terminal part
(Table VI, variant F); and a 137 bp deletion of nucleotides 115-251 leading to
a loss of the initiating ATG codon, the 13 bp deletion described above and a
199 bp insertion downstream of nucleotide 715 encoding an N-terminal truncated
protein containing the 138 C-terminal amino acids of MLN 64 (Table VI, variant
G).
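
The nucleotide and amino acid positions quoted for the MLN 64 variants can
be cross-checked with simple reading-frame arithmetic. The sketch below
uses only numbers stated in the text (initiation codon at position 169-171,
a protein of about 445 residues); the helper name codon_of is hypothetical.

```python
MLN64_ORF_START = 169     # first base of the initiating ATG, per the text
MLN64_LENGTH_AA = 445     # approximate protein length, per the text


def codon_of(nucleotide, orf_start=MLN64_ORF_START):
    """Map a 1-based cDNA position to (codon number, position in codon),
    both 1-based, for an ORF starting at `orf_start`."""
    if nucleotide < orf_start:
        raise ValueError("position lies upstream of the initiation codon")
    offset = nucleotide - orf_start
    return offset // 3 + 1, offset % 3 + 1


# Variants A and B: substitutions at nucleotides 262 and 518.
print(codon_of(262))   # codon 32, first base
print(codon_of(518))   # codon 117, second base

# Variant C: an in-frame 99 bp deletion removes 33 codons.
print(MLN64_LENGTH_AA - 99 // 3)

# Variants D and E: insertion after nucleotide 963, the last base of a codon.
native_codons, position = codon_of(963)
print(native_codons + 16, native_codons + 20)
```

The results agree with the positions given above: amino acids 32 and 117
for the two substitutions, a 412-residue protein for variant C, and 281 and
285 residues for variants D and E (265 native residues plus 16 or 20
aberrant ones).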

Based on the above description, generating these seven distinct variants
A-G and the polypeptides they encode would be routine for one skilled in the
art. For example, as discussed in detail in Example 4, below, the present
inventors have cloned these variants from cDNA libraries obtained from
metastatic axillary lymph node tissue, an SKBR3 breast cancer cell line, and
nontransformed placenta tissue. Moreover, several variants could also be
generated by site-directed mutagenesis of the MLN 64 gene whose sequence is
shown in Figure 16 (SEQ ID NO:5).

In a further aspect, the present invention is directed to polynucleotides
having a nucleotide sequence complementary to the nucleotide sequence of any
of the polynucleotides discussed above.
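
A complementary sequence of the kind referred to above is conventionally
written 5' to 3' as the reverse complement. A minimal sketch follows; the
six-base input is an invented toy string and the function name
reverse_complement is illustrative.

```python
# Watson-Crick pairing for DNA, preserving letter case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand of `seq`, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGGCC"))
```

Applying the function twice returns the original sequence, which is a quick
sanity check on any complement routine.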

Expressed Sequence Tags 

An expressed sequence tag (EST) is a segment of a sequence from a
randomly selected cDNA clone that corresponds to a mRNA (Adams, M.D. et al.,
Science 252:1651-1656 (1991); Adams, M.D. et al., Nature 355:632-634 (1992);
Adams, M.D. et al., Nat. Genet. 4:373-380 (1993)). Nine ESTs with at least
partial homology to a portion of the CART1 (MLN 62) nucleotide sequence were
identified by the present inventors in GenBank (Accession Nos. T64889, T97084,
R37445, R61143, T96972, R12544, T40174, R61861 and T41053). The alignment of
these ESTs relative to the CART1 nucleotide sequence is provided in Figure 22.

Twenty-two ESTs with at least partial homology to a portion of the Lasp-1
(MLN 50) nucleotide sequence were identified by the present inventors in
GenBank (Accession Nos. T15543, T33692, T32123, T34158, F04305, T33826,
T32139, T51225, D12116, T51881, T51339, T24771, T10815, T60382, M86141,
T34342, T08601, T32161, T34065, Z45434, T08349 and F06105). The alignment of
these ESTs relative to the Lasp-1 nucleotide sequence is provided in Figure
14(B).

Fourteen ESTs with at least partial homology to a portion of the MLN 64
nucleotide sequence were identified by the present inventors in GenBank
(Accession Nos. M85471, T49922, T85470, T85372, R02020, S70803, R02021,
R17500, R41043, R36697, R37545, R42594, R48774 and R48877).

Three ESTs with at least partial homology to a portion of the MLN 51
nucleotide sequence were identified by the present inventors in GenBank
(Accession Nos. Z25173, D19971 and D11736). The alignment of these ESTs
relative to the MLN 51 nucleotide sequence is provided in Figure 23.

Three ESTs with at least partial homology to a portion of the D53
nucleotide sequence were identified by the present inventors in GenBank
(Accession Nos. T89899, T68402 and T93647).

Isolated RNA Molecules

The present invention further provides isolated RNA molecules which are
in vitro transcripts of one of the deposited cDNAs described above, a nucleic
acid sequence shown in Figure 6, 14, 16, 21(A-D), 24(B) or 25(B) (SEQ ID NO:1,
3, 5, 7, 9 or 11, respectively) or a fragment thereof. Such RNA molecules are
useful as antisense RNA probes for detecting CART1, Lasp-1, MLN 64, MLN 51,
mD52, hD52 or D53 gene expression by in situ hybridization. For example, the
present inventors have generated a labeled antisense RNA probe by in vitro
transcription of a [illegible] fragment (corresponding to [illegible] of
Figure 6 (SEQ ID NO:1)) of the CART1 cDNA. The RNA probe was used to detect
CART1 gene expression in malignant epithelial cells and invasive carcinomas
(see infra, Example 2). The present inventors also generated a labeled
antisense RNA probe specific for the human MLN 64 cDNA by in vitro
transcription. This RNA probe was used to detect MLN 64 gene expression in
malignant epithelial cells and invasive carcinomas (see infra, Example 4).

Polypeptides and Fragments Thereof

CART1 Polypeptide

The invention further provides an isolated CART1 polypeptide having an
amino acid sequence as encoded by the cDNA deposited as ATCC Deposit No.
97610, or as shown in Figure 6 (SEQ ID NO:2), or a fragment thereof. The
CART1 polypeptide, which the inventors have shown is localized in the nucleus
of breast carcinoma cells, is an about 470-residue protein exhibiting three
main structural domains. First, a cysteine-rich domain was located at the
N-terminal part of the protein (amino acid residues 18-57 of Figure 6 (SEQ ID
NO:2)) which corresponds to an unusual RING finger motif, presumably involved
in protein-protein binding. Second, an original cysteine-rich domain was
located at the core of the protein (amino acid residues 83-282 of Figure 6
(SEQ ID NO:2)) and is constituted by three repeats of an HC3HC3 consensus
motif, possibly involved in nucleic acid and/or protein-protein binding, that
has been designated as the CART motif. Third, the C-terminal part of the CART1
protein corresponds to a TRAF domain (amino acid residues 308-470 of Figure 6
(SEQ ID NO:2)) known to be involved in protein/protein interactions.

Similar association of RING, CART and TRAF domains has been
observed in the art in the human CD40-binding protein and in the mouse tumor
necrosis factor (TNF) receptor-associated factor 2 (TRAF2), both involved in
signal transduction mediated by the TNF receptor family, and in the
developmentally regulated Dictyostelium discoideum DG17 protein. This suggests
that, together with CART1, these structurally related proteins are members of
a new protein family and that CART1 may be involved in TNF-related cytokine
signal transduction during breast cancer progression. Thus, since the CART1
DNA sequence is provided in Figure 6 (SEQ ID NO:1) as are the regions which
encode the RING, CART and TRAF domains, it would be well within the purview of
the skilled artisan to generate recombinant constructs similar or equivalent
to those listed below.

As discussed above, the present inventors have discovered that the CART1
polypeptide is a prognostic marker of breast cancer. Thus, this polypeptide
and its fragments can be used to generate polyclonal and monoclonal antibodies
as discussed above for use in prognostic assays such as immunohistochemistry
and RIA on cytosol. For example, the present inventors have substantially
purified recombinantly produced CART1 and injected it into mice to raise
monoclonal antibodies. Moreover, a polypeptide fragment of CART1,
corresponding to the sequence ____ to ____ of Figure 6 (SEQ ID NO:2), has been
injected into rabbits to raise a polyclonal antibody.

Lasp-1 Polypeptide

The invention further provides an isolated Lasp-1 polypeptide having an
amino acid sequence as encoded by the cDNA deposited as ATCC Deposit No.
97608, or as shown in Figure 14 (SEQ ID NO:4), or a fragment thereof. The
present inventors have discovered that the Lasp-1 polypeptide is an about
261-residue protein exhibiting two main structural domains. First, one copy of
a cysteine-rich LIM/double zinc finger-like motif is located at the N-terminal
part of the protein (amino acids 1-51 of Figure 14 (SEQ ID NO:4)). Second, a
SH3 (Src homology region 3) domain is located at the C-terminal part of the
protein (amino acids 196-261 of Figure 14 (SEQ ID NO:4)). Lasp-1 is the first
protein exhibiting associated LIM and SH3 domains and thus represents the
first member of a new protein family. Thus, since the Lasp-1 DNA sequence is
provided in Figure 14 (SEQ ID NO:3) as are the regions which encode the LIM
and SH3 domains, it would be well within the purview of the skilled artisan to
generate recombinant constructs similar or equivalent to those listed below.

As discussed above, the present inventors have discovered that the Lasp-1
polypeptide is a prognostic marker of breast cancer. Thus, this polypeptide
and its fragments can be used to generate polyclonal and monoclonal antibodies
as discussed above for use in prognostic assays such as immunohistochemistry
and RIA on cytosol.

MLN 64 Polypeptide

The invention further provides an isolated MLN 64 polypeptide having an
amino acid sequence as encoded by the cDNA deposited as ATCC Deposit No.
97609, or as shown in Figure 16 (SEQ ID NO:6), or a fragment thereof. The
invention also provides polypeptides encoded for by the seven variants A-G
discussed above. These variations of the MLN 64 protein are discussed in
detail in Example 4, below. The present inventors have discovered that the MLN
64 protein shown in Figure 16 (SEQ ID NO:6) is an about 445-residue protein
exhibiting two potential transmembrane domains (at residues 1-72 and 94-168)
and several potential leucine zipper and leucine-rich repeat structures. Amino
acid composition analysis showed 11.5% aromatic residues (Phe, Trp and Tyr)
and 26% aliphatic residues (Leu, Ile, Val and Met). Thus, since the MLN 64 DNA
sequence is provided in Figure 16 (SEQ ID NO:5), it would be well within the
purview of the skilled artisan to generate recombinant constructs similar or
equivalent to those listed below.
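
The kind of amino acid composition analysis mentioned above can be sketched
as a percentage count over residue classes. The residue groupings follow
the text (aromatic: Phe, Trp, Tyr; aliphatic: Leu, Ile, Val, Met); the
20-residue toy sequence is invented for illustration and is not the MLN 64
sequence, and the function name composition_percent is hypothetical.

```python
AROMATIC = set("FWY")      # Phe, Trp, Tyr (one-letter codes)
ALIPHATIC = set("LIVM")    # Leu, Ile, Val, Met, as grouped in the text


def composition_percent(protein, residues):
    """Percentage of positions in `protein` occupied by any of `residues`."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    hits = sum(1 for aa in protein if aa in residues)
    return 100.0 * hits / len(protein)


toy_protein = "MLLIVFWYKDEGSTAPQNRH"   # invented, 20 residues
print(composition_percent(toy_protein, AROMATIC))
print(composition_percent(toy_protein, ALIPHATIC))
```

Run against the sequence of Figure 16 (SEQ ID NO:6), a calculation of this
kind would be expected to give figures close to the 11.5% and 26% reported
above.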

The present inventors have discovered that the MLN 64 polypeptide is a
prognostic marker of breast cancer. Thus, this polypeptide, its fragments, and
the polypeptide variants discussed above can be used to generate polyclonal
and monoclonal antibodies for use in prognostic assays such as
immunohistochemistry and RIA on cytosol.

For example, a polypeptide fragment of the MLN 64 protein, 16 amino
acids in length located in the C-terminal part of the MLN 64 protein, was
synthesized by the inventors in solid phase using Fmoc chemistry and coupled
to ovalbumin through an additional NH2-extraterminal cysteine residue, using
the bifunctional reagent MBS. This synthetic MLN 64 fragment was injected into
BALB/c mice periodically until obtention of positive sera. Spleen cells were
removed and fused with myeloma cells according to St. Groth, J. Immunol. Meth.
35:1-21 (1980). Culture supernatants were screened by ELISA using the
unconjugated peptide fragment as antigen. Positive culture media were tested
by immunocytofluorescence and Western blot analysis on MLN 64 cDNA transfected
COS-1 cells. Several hybridomas, found to secrete monoclonal antibodies
specifically recognizing MLN 64 protein, were cloned twice on soft agar.
Monoclonal antibodies directed against the synthetic MLN 64 peptide fragment
were employed in an immunohistochemical analysis which showed MLN 64 protein
staining restricted to transformed epithelial cells.

MLN 51 Polypeptide

The invention further provides an isolated MLN 51 polypeptide having an
amino acid sequence as encoded by the cDNA deposited as ATCC Deposit No.
97611, or as shown in Figure 21(A-D) (SEQ ID NO:8), or a fragment thereof.
The present inventors have discovered that the MLN 51 polypeptide is an about
534-residue protein. Thus, since the MLN 51 DNA sequence is provided in
Figure 21(A-D) (SEQ ID NO:7), it would be well within the purview of the
skilled artisan to generate recombinant constructs similar or equivalent to
those listed below.

As discussed above, the present inventors have discovered that the MLN
51 polypeptide is a prognostic marker of breast cancer. Thus, this
polypeptide, its fragments, and the polypeptide variants discussed above can
be used to generate polyclonal and monoclonal antibodies for use in prognostic
assays such as immunohistochemistry and RIA on cytosol.

D53 Polypeptide

The invention further provides an isolated D53 polypeptide having an
amino acid sequence as encoded by the cDNA deposited as ATCC Deposit No.
97607, or as shown in Figure 24(B) (SEQ ID NO:10), or a fragment thereof. The
present inventors have discovered that the D53 polypeptide is about 204 amino
acids in length and have identified a single coiled-coil domain in hD53, as
well as in the hD52 homolog and mouse D52, towards the N-terminus of each
protein, which is predicted to end at Leu[illegible] in all 3 proteins. This
coiled-coil domain overlaps with the leucine zipper predicted in hD52/N8 using
helical wheel analysis. The presence of a coiled-coil domain in D52 family
proteins indicates that specific protein-protein interactions are required for
the functions of these proteins. The present inventors have identified the
presence of 2 candidate PEST domains in the three proteins, hD53, hD52 and
mD52, indicating that their intracellular abundances may be in part controlled
by proteolytic mechanisms. Interestingly, the extent of the N-terminally
located PEST domain overlaps that of the coiled-coil domain in both D52 and
D53 proteins. It could thus be envisaged that interactions via the coiled-coil
domain could mask this PEST domain, in accordance with the hypothesis that
PEST sequences may act as conditional proteolytic signals in proteins able to
form complexes (Rechsteiner, M., Adv. Enzyme Reg. 27:135-151 (1988)). Also,
the sequences of the three proteins contain an uneven distribution of charged
amino acids; while approximately the first and last 50 amino acids of each
protein exhibit a predominant negative charge, the central portion of each
protein exhibits an excess of positively charged residues. Finally, the
present inventors have identified similar potential post-translational
modification sites in the three proteins.

The present inventors have discovered that the D53 polypeptide is a tumor
marker in breast cancer. Moreover, relative hD52/hD53 gene expression levels
are useful as a marker for distinguishing between different forms of leukemia.

Murine D52 Polypeptide

The invention further provides an isolated mD52 polypeptide having an
amino acid sequence as shown in Figure 25(B) (SEQ ID NO:12), or a fragment
thereof. The present inventors have discovered that the mD52 polypeptide is an
about 185 amino acid residue protein having domain features as described
above.

Polypeptide Fragments and Variants

Fragments of CART1, Lasp-1, MLN 64, MLN 51, mD52 or D53 other
than those described above capable of raising both monoclonal and polyclonal
antibodies will be readily apparent to one of skill in the art and will
generally be at least 10 amino acids, and preferably at least 15 amino acids,
in length. For example, the "good antigen" criteria set forth in Van
Regenmortel et al., Immunol. Letters 17:95-108 (1988), could be used for
selecting fragments of the CART1, Lasp-1, MLN 64, MLN 51, mD52 or D53 protein
capable of raising monoclonal and polyclonal antibodies.

It will be recognized in the art that some amino acid sequences of CART1,
Lasp-1, MLN 64, MLN 51, mD52 or D53 can be varied without significant effect
on the structure or function of the protein. If such differences in sequence
are contemplated, it should be remembered that there will be critical areas on
the protein which determine activity. Such areas will usually comprise
residues which make up the binding site, or which form tertiary structures
which affect the binding site. In general, it is possible to replace residues
which form the tertiary structure, provided that residues performing a similar
function are used. In other instances, the type of residue may be completely
unimportant if the alteration occurs at a noncritical region of the protein.

Thus, the present invention further includes variations of the CART1,
Lasp-1, MLN 64, MLN 51, mD52 or D53 protein which show substantial protein
activity or which include regions of the CART1, Lasp-1, MLN 64, MLN 51, mD52
or D53 protein such as the protein fragments discussed above capable of
raising antibodies useful in immunohistochemical or RIA assays. Such mutants
include deletions, insertions, inversions, repeats and type-substitutions
(e.g., substituting one hydrophilic residue for another, but not strongly
hydrophilic for strongly hydrophobic as a rule). Small changes or such
"neutral" amino acid substitutions will generally have little effect on
activity.

Typically seen as conservative substitutions are as follows: the
replacements, one for another, among the aliphatic amino acids, Ala, Val, Leu and
Ile; interchange of the hydroxyl residues, Ser and Thr; exchange of the acidic
residues, Asp and Glu; substitution between the amide residues, Asn and Gln;
exchange of the basic residues, Lys and Arg; and replacements among the
aromatic residues, Phe, Tyr. As indicated in detail above, further guidance
concerning which amino acid changes are likely to be phenotypically silent (i.e.,
are not likely to have a significant deleterious effect on a function) can be found
in Bowie, J.U. et al., Science 247:1306-1310 (1990).
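
The substitution groups listed above can be written down as a small lookup. The following is only an illustrative sketch: the group membership is taken from the paragraph above, while the names CONSERVATIVE_GROUPS and is_conservative are hypothetical and are not part of the invention as described.

```python
# Conservative substitution groups as listed in the text
# (three-letter amino acid codes). The data layout and the
# function name are illustrative assumptions.
CONSERVATIVE_GROUPS = {
    "aliphatic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
    "basic": {"Lys", "Arg"},
    "aromatic": {"Phe", "Tyr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues belong to the same listed group."""
    if original == replacement:
        return True  # identical residue, nothing was substituted
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )
```

Under this sketch, is_conservative("Leu", "Ile") holds, whereas a Lys to Asp change falls outside every listed group. Whether a given change is in fact phenotypically silent remains an experimental question.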

Preferably, such variants will be at least 90%, 95%, 97%, 98% or 99%
identical to the CART1, Lasp-1, MLN 64, MLN 51, mD52 or D53 polypeptides
described above and also include portions of such polypeptides with at least 30
amino acids and more preferably at least 50 amino acids. By the invention, "%
identity" between two polypeptides can be determined using the "fastA" computer
algorithm with the default parameters (Pearson, W.R. & Lipman, D.J., Proc. Natl.
Acad. Sci. USA 85:2444 (1988)).
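
To make the notion of "% identity" concrete, the following minimal sketch counts identical positions between two sequences that have already been aligned. It is not the fastA algorithm itself, which also performs the alignment; the function name percent_identity and the choice to skip gapped columns are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of two aligned sequences.

    Both inputs must be the same length, with '-' marking alignment gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # assumption: gapped columns are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

Alignment programs differ in how they treat gaps in the denominator, so values from this sketch may not match the figure reported by a particular tool.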

The isolated CART1, Lasp-1, MLN 64, MLN 51, mD52, or D53
polypeptide, or a fragment thereof, is preferably provided in an isolated form,
and preferably is substantially purified. Of course, purification methods are
known in the art. In a preferred embodiment,
the CART1, Lasp-1, MLN 64, MLN 51, mD52 or D53 polypeptide is
substantially purified by the one-step method described in Smith and Johnson,
Gene 67:31-40 (1988). The CART1, Lasp-1, MLN 64, MLN 51, mD52 or D53
protein can be recovered and purified from recombinant cell cultures by well-
known methods including ammonium sulfate or ethanol precipitation, acid
extraction, anion or cation exchange chromatography, phosphocellulose
chromatography, hydrophobic interaction chromatography, affinity
chromatography, hydroxylapatite chromatography and lectin chromatography.
Most preferably, high performance liquid chromatography ("HPLC") is employed
for purification. Polypeptides of the present invention include naturally purified
products, products of chemical synthetic procedures, and products produced by
recombinant techniques from a prokaryotic or eukaryotic host, including, for
example, bacterial, yeast, higher plant, insect and mammalian cells. Depending
upon the host employed in a recombinant production procedure, the polypeptides
of the present invention may be glycosylated or may be nonglycosylated. In
addition, polypeptides of the invention may also include an initial modified
methionine residue, in some cases as a result of host-mediated processes.

Vectors and Hosts 

The present invention also relates to vectors which include an isolated
DNA molecule(s) of the present invention, host cells which are genetically
engineered with the vectors, and the production of CART1, Lasp-1, MLN 64,
MLN 51, mD52 or D53 polypeptide(s), or fragments thereof, by recombinant
techniques.

A DNA molecule, preferably a cDNA, encoding the CART1, Lasp-1,
MLN 51, MLN 64, mD52 or D53 polypeptide or a fragment thereof may easily
be inserted into a suitable vector. Ideally, the vector has suitable restriction sites
for ease of insertion, but blunt-end ligation, for example, may also be used,
although this may lead to uncertainty over reading frame and direction of
insertion. In such an instance, it is a matter of course to test transformants for
expression, 1 in 6 of which should have the correct reading frame.
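
The "1 in 6" figure follows from two possible insert orientations multiplied by three possible reading frames. The short sketch below enumerates those outcomes; it assumes, purely for illustration, that each outcome is equally likely, and the names used are hypothetical.

```python
from itertools import product

# Two possible directions of insertion and three possible frame offsets
# relative to the vector's start codon.
ORIENTATIONS = ("forward", "reverse")
FRAME_OFFSETS = (0, 1, 2)


def insertion_outcomes():
    """List every (orientation, frame offset) combination."""
    return list(product(ORIENTATIONS, FRAME_OFFSETS))


def fraction_correct() -> float:
    """Fraction of outcomes that are forward and in frame (offset 0)."""
    outcomes = insertion_outcomes()
    correct = [o for o in outcomes if o == ("forward", 0)]
    return len(correct) / len(outcomes)
```

Only one of the six combinations places the coding sequence in the forward direction and in frame, which is why a handful of transformants is normally enough to find an expressing clone.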

The CART1, Lasp-1, MLN 51, MLN 64, mD52 or D53 polypeptide(s),
or fragments thereof, can be expressed in any suitable host cell. The extent of
expression may be analyzed by SDS polyacrylamide gel electrophoresis
(Laemmli et al., Nature 227:680-685 (1970)). Cultures useful for production
of such polypeptides include prokaryotic, eukaryotic and yeast expression systems.
Preferred systems include E. coli, Streptomyces and Salmonella typhimurium and
yeast, mammalian or plant cells. Mammalian hosts include HeLa, COS, and
Chinese Hamster Ovary (CHO) cells. Yeast hosts include S. cerevisiae. Insect
cells include Drosophila S2 and Spodoptera Sf9 cells. Appropriate culture
mediums and conditions for the above-described host cells are known in the art.
Vectors capable of directing expression in the above-mentioned host cells are also
known in the art.

The present inventors have designed the following recombinant DNA
expression constructs which encode either the entire CART1 protein or fragments
of the CART1 protein corresponding to the individual domains discussed above.
Bacterial expression systems are as follows: pGEX-CART1; pGEX-RING;
pGEX-CART; pGEX-CART-TRAF; and pGEX-TRAF. Yeast expression
systems are as follows: pBTMN-CART-TRAF; pBTMN-CART; pBTMN-TRAF;
pVP-CART-TRAF; pVP-CART; and pVP-TRAF. Eukaryotic expression systems
are as follows: pSG5-CART1; pAT3-CART1; pAT4-CART1; pBC-CART1; and
pCMV-CART1.

For example, by pAT4-CART1 is intended the pAT4 vector containing
the entire CART1 DNA coding sequence as an insert. Similarly, by pBTMN-
CART-TRAF is intended the pBTMN vector containing the DNA sequence
encoding the CART and TRAF regions of the CART1 protein. The remaining
constructs listed above are to be interpreted in a like manner. The pGEX,
pBTMN, pVP, pSG5, pAT3, pAT4, pBC and pCMV vectors are known in the art
and publicly available.

The present inventors have designed the following recombinant DNA
expression constructs which encode either the entire Lasp-1 protein or fragments
of the Lasp-1 protein. Bacterial expression systems are as follows: pGEX-
LASP1; pGEX-LIM; and pGEX-SH3. Yeast expression systems are as follows:
pBTMN-LASP1; pBTMN-LIM; pBTMN-SH3; pVP-LASP1; pVP-LIM; and
pVP-SH3. Eukaryotic expression systems are as follows: pSG5-LASP1; pBC-
LASP1; and pCMV-LASP1. The pGEX, pBTMN, pVP, pSG5, pBC and pCMV
vectors are known in the art and publicly available.

The present inventors have designed the following recombinant DNA
expression constructs which encode the MLN 64 protein. Bacterial expression
systems include pGEX-MLN 64. Eukaryotic expression systems include pSG5-
MLN 64 and pBC-MLN 64. The pGEX, pSG5 and pBC vectors are known and
publicly available.

Having generally described the invention, the same will be more readily
understood through reference to the following examples which are provided by
way of illustration and are not intended to be limiting.

Experiments

Example 1 

Identification of Four Novel Human Genes Amplified and Overexpressed
in Breast Carcinoma and Located to the q11-q21.3 Region of
Chromosome 17

Introduction

Despite earlier detection and a smaller size of the primary tumors at the time
of diagnosis (Nyström, L. et al., Lancet 341:973-978 (1993); Fletcher, S.W. et al.,
J. Natl. Cancer Inst. 85:1644-1656 (1993)), associated metastases remain the
major cause of breast cancer mortality (Frost, P. & Levin, R., Lancet 339:1458-
1461 (1992)). Therefore, defining the mechanisms involved in the formation and
growth of metastases is still a major challenge in breast cancer research (Rusciano,
D. & Burger, M.M., BioEssays 14:185-194 (1992); Hoskins, K. & Weber, B.L.,
Curr. Opin. Oncol. 6:554-559 (1994)). The processes leading to the formation
of metastases are complex (Fidler, I.J., Cancer Res. 50:6130-6138 (1990); Liotta,
L. et al., Cell 64:327-336 (1991)), and identifying the related molecular events is
thus critical for the selection of optimal treatments.

The initial steps of transformation, characterized by the malignant cell
escape from normal cell cycle controls, are driven by the expression of dominant
oncogenes and/or the loss of tumor suppressor genes (Hunter, T. & Pines, J., Cell
79:573-582 (1994)). Tumor progression can be considered as the ability of the
malignant cells to leave the primary tumoral site and, after migration through
lymphatic or blood vessels, to grow at a distance in host tissue and form a
secondary tumor (Fidler, I.J., Cancer Res. 50:6130-6138 (1990); Liotta, L. et al.,
Cell 64:327-336 (1991)). Progression to metastasis is dependent not only upon
transformation but also upon the outcome of a cascade of interactions between the
malignant cells and the host cells/tissues. These interactions may reflect molecular
modification of synthesis and/or of activity of different gene products both in
malignant and host cells. Several genes involved in the control of tumoral
progression have been identified and shown to be implicated in cell adhesion,
extracellular matrix degradation, immune surveillance, growth factor synthesis
and/or angiogenesis (reviewed in Hart, I.R. & Saini, A., Lancet 339:1453-1461
(1992); Ponta, H. et al., B.B.A. 1198:1-10 (1994); Bernstein, L.R. & Liotta, L.A.,
Curr. Opin. Oncol. 6:106-113 (1994); Brattain, M.G. et al., Curr. Opin. Oncol.
6:77-81 (1994); Fidler, I.J. & Ellis, L.M., Cell 79:185-188 (1994)).

In order to identify and clone genes which could be involved in cancer
progression, we performed a differential screening of a cDNA library established
from breast cancer derived metastatic axillary lymph nodes (MLN). In breast
cancer, axillary lymph nodes are usually the earliest sites for metastasis formation,
and they are routinely removed for diagnostic purposes (Carter, C.L. et al.,
Cancer 63:181-187 (1989)). Systemic metastases will usually occur later on in
the disease, principally in bone, brain and viscera (Rusciano, D. & Burger, M.M.,
BioEssays 14:185-194 (1992)) and, because there is no benefit in terms of survival
for the patients, they are rarely removed. Similar differential screening protocols
have already permitted the identification of several genes possibly involved in
tumor progression, including the stromelysin-3 gene, which is overexpressed in
most invasive breast carcinomas (Basset, P. et al., Nature 348:699-704 (1990)),
and the maspin gene, whose expression is reduced in breast cancer cell lines (Zou,
Z. et al., Science 263:526-529 (1994)). In the present study, the screening of the
MLN cDNA library was performed using two probes representative of malignant
(MLN) and of nonmalignant (fibroadenomas; FA) breast tissues, respectively.
Metastatic samples were obtained from patients harboring clinical and histological
characteristics associated with a poor prognosis and a high propensity of
metastatic spreading. FAs, which are benign tumors, have been selected as
control tissues since, although nonmalignant, they are proliferating tissues, thereby
minimizing the probability of identifying mRNAs characteristic of cellular growth,
but unrelated to the malignant process.

Here we report the identification of four novel genes, co-localized on the
chromosome 17 long arm, and amplified and overexpressed in malignant breast
tissues.

Materials and Methods 

Tissues and Cell Cultures

Surgical specimens obtained at the Hôpitaux Universitaires de Strasbourg
were frozen in liquid nitrogen for RNA extraction. Adjacent sections were fixed
in 10% buffered formalin and paraffin embedded for histological examination.

The cell lines (ZR-75-1, MCF7, SK-BR-3, BT-20, BT-474, HBL-100,
MDA-MB-231 and T-47D) are described and available in the American Type
Culture Collection (ATCC, Rockville, MD). The lines MCF7, ZR-75-1, BT-474
and T-47D are estrogen receptor positive, whereas BT-20, SK-BR-3 and MDA-
MB-231 are estrogen receptor negative. Cells were routinely maintained in our
laboratory and were cultured at confluency in Dulbecco's modified Eagle's medium
supplemented with 10% fetal calf serum.

RNA Preparation and Analysis

Surgical specimens were homogenized in the guanidinium isothiocyanate
lysis buffer and purified by centrifugation through cesium chloride cushion
(Chirgwin, J.M. et al., Biochemistry 18:5294 (1979)). PolyA+ RNA was purified
using oligodT cellulose chromatography (Aviv, H. & Leder, P., Proc. Natl. Acad.
Sci. USA 69:1408-1412 (1972)). RNAs from cultured cell lines were extracted
using the single-step procedure of Chomczynski, P. & Sacchi, N. (Anal. Biochem.
162:156-159 (1987)). RNAs were fractionated by electrophoresis on 1% agarose,
2.2 M formaldehyde gels (Lehrach, H. et al., Biochemistry 16:4743-4751 (1977)),
transferred to nylon membrane (Hybond N, Amersham Corp., Arlington Heights,
IL) and immobilized by baking for — 

cDNA Library Construction

PolyA+ RNA from four independent surgical specimens of breast cancer
MLNs was pooled. The cDNA was synthesized using MMLV reverse
transcriptase (Superscript™, Gibco BRL, Gaithersburg, MD) and oligodT
(Pharmacia Fine Chemicals, Piscataway, NJ) as primer. Second strand synthesis
was performed by RNaseH replacement (Gubler, U. & Hoffman, B.J., Gene
25:263-269 (1983)). After blunt-ending using T4 DNA Polymerase I, EcoRI
adaptors were added. After ligation, excess of adaptors and molecules less than
300 bp were removed by gel filtration chromatography on Biogel A50m (Bio-Rad,
Richmond, CA). Size selected cDNAs were ligated in the EcoRI cloning site of
lambda ZAPII (Stratagene Inc., La Jolla, CA).

Probe Preparation 

In order to obtain a MLN specific probe (plus probe), 3 µg of polyA+
RNA purified from MLN were subjected to first strand cDNA synthesis and
370 ng of cDNA were obtained by oligodT priming. RNA molecules were
removed by NaOH hydrolysis and single-stranded cDNA was hybridized to 7 µg
of polyA+ RNA purified from a breast FA (19x excess). After hybridization for
24 hrs at 68°C (Hedrick, S.M. et al., Nature 308:149-153 (1984); Rhyner, T.A.
et al., J. Neurosci. Res. 16:167-181 (1986)), single-stranded material (12% of the
starting cDNA) was purified by hydroxylapatite chromatography (Bio-Rad,
Richmond, CA). The minus probe, derived from a breast FA, was similarly
obtained from 5 µg of polyA+ RNA which were converted into 560 ng of single-
stranded cDNA and hybridized to 7 µg of normal colon and liver (20x excess).
After hydroxylapatite chromatography, 14% of the cDNA remained single-
stranded. In both cases, single-stranded cDNAs were concentrated and washed
with T10E1 using Centricon 30 (Amicon, Beverly, MA). Twenty ng and 40 ng of

plus and minus probes were obtained, respectively. Labeling by random priming
(Feinberg, A.P. & Vogelstein, B., Anal. Biochem. 132:195-203 (1983)) of 10 ng
of single-stranded cDNA gave 2x10^ and 5x10^ cpm/µg of plus and minus probes,
respectively.

cDNA Library Screening

One hundred thousand pfu from the MLN library were plated, and nylon
filter replica (Biodyne A transfer membrane, Pall Europe Limited, Portsmouth)
were hybridized at 42°C in 50% formamide, 5x SSC, 0.4% ficoll, 0.4%
polyvinylpyrrolidone, 20 mM sodium phosphate, pH 6.5, 0.5% SDS, 10% dextran
sulfate and 100 µg/ml denatured salmon sperm DNA, for 36-48 hrs, with the
labeled plus or minus probes diluted to 0.5-1x10^ cpm/ml. Stringent washings
were performed at 60°C in 0.1x SSC and 0.1% SDS. Filters were
autoradiographed at -80°C for 24-72 hrs. Plaques giving differential signals with
the plus and minus probes were picked up and subjected to a secondary screening
using the same hybridization conditions.

Plasmid Recovery and Southern Blot Analysis

Pure plaques were directly recovered as bacterial colonies using the
pBluescript/λZAPII in vivo excision system (Stratagene Inc., La Jolla, CA). Small
scale plasmid extractions were performed (Zhou, C. et al., Biotechniques 8:172-
173 (1990)) and approximately 1/10 of the material (200 ng) was digested with
EcoRI and loaded on 2 agarose gels, run in parallel. After electrophoresis, gels
were blotted onto nylon membranes (Hybond N+, Amersham Corp.) and
membranes were hybridized to the plus and minus probes. Inserts from selected
clones were purified from agarose gel and 32P-labeled by random priming, and
^used for NorthOTi and Southern blot analyses and ero 

Sequencing and Computer Analysis

Plasmid templates, prepared as previously described, were treated with
RNaseA (10 µg/ml) for 30 min, then precipitated by 0.57 volume of polyethylene
glycol NaCl (20%, 2 M), washed with ethanol, vacuum-dried and resuspended at
200 ng/µl in T10E1. The double-stranded DNA templates were sequenced with
Taq polymerase and either pBluescript universal or internal primers, using dye-
labeled ddNTPs for detection on an Applied Biosystems 373A automated
sequencer. Sequence analyses were performed using the GCG sequence analysis
package (Wisconsin package, version 8.0, Genetics Computer Group, Madison,
WI). Sequence homologies were identified using the FastA and Blast programs
by searching the complete combined GenBank/EMBL databanks (release
84.0/39.0) and, in the case of translated sequences, by searching the complete
SwissProt database (release 29.0).

Genomic DNA Extraction and Southern Blot Analysis

Cells were grown in 75 cm2 flasks at confluency, and washed with 1x
PBS. After addition of 2 ml of extraction buffer (10 mM Tris-HCl, pH 8.0, 0.1
M Na2EDTA, pH 8.0, 20 µg/ml RNaseA, 0.5% SDS, 100 µg/ml proteinase K),
the flasks were incubated at 42°C for 12 hrs. Genomic DNA was recovered by
precipitation with 1 volume of isopropanol. After washing in 70% ethanol, DNA
was air-dried and dissolved in T10E1 at 4°C. For DNA amplification studies,
10 µg of cell line genomic DNA were BamHI digested until completion. For
chromosomal localization, DNA extracted from human/rodent somatic cell hybrids
(NIGMS Mapping panel #2; Coriell Cell Repositories, Camden, NJ) digested with
BamHI or EcoRI until completion was used. In both cases, BamHI or EcoRI
digested genomic DNA was fractionated on 0.8% agarose gel and blotted onto
Hybond N+ membranes. Quantitation of MLN gene copy number in cell
lines was determined by dot-blot analysis. Genomic DNA (2.5 µg) was denatured
in 0.4 M NaOH at 65°C for 1 hr and 2-fold serial dilutions were spotted onto
Hybond N+ membranes. Hybridization and washing were performed as described
for cDNA library screening. Control probe p53 corresponded to a 2.0 kb BamHI
fragment released from php53B (ATCC No. 57254). RNA loading control
\^ suitable for lumianceik and tissues was an internal (^^ 
(Masiakowski, P. et al., Nucleic Acids Res. 10:7895-7903 (1982)).

Gene Mapping

Chromosomal assignment of genes MLN 50, 51, 62 and 64 was carried
out by in situ hybridization on chromosome preparations obtained from
phytohemagglutinin-stimulated human lymphocytes, cultured for 72 hrs.
5-Bromodeoxyuridine (60 µg/ml) was added to the medium for the final 7 hrs of
culture to ensure posthybridization chromosomal banding of good quality. cDNA
probes were 3H-labeled by nick-translation to a specific activity of 1.5x10^
dpm/µg. The radiolabeled probes were hybridized to metaphase spreads at a final
concentration of 25 ng/ml of hybridization solution, as previously described
(Mattei, M.G. et al., Human Genet. 69:268-271 (1985)). After the slides were
coated with nuclear track emulsion (NTB2; Kodak, Rochester, NY), they were
exposed for 19 days at 4°C before development. To avoid any slipping of silver
grains during the banding procedure, chromosome spreads were first stained with
buffered Giemsa solution, and metaphases were photographed. R-banding was
then performed by the fluorochrome-photolysis-Giemsa method, and metaphases
were rephotographed before analysis.

Differential Screening of the MLN cDNA Library

Four patients with ductal breast carcinomas were selected according to
their age (below 50 years of age), the large size and high histological grade of
their primary tumor (Bloom, H.J.G. & Richardson, W.W., Brit. J. Cancer 11:359-
366 (1957)) and the presence of MLN (Table I). Because of the high
heterogeneity of breast tumors (Lönn, U. et al., Int. J. Cancer 58:40-45 (1994)
and refs. therein), RNAs were extracted from metastatic samples coming from the
four patients and pooled in relative equal amounts, in order to prepare a
representative breast MLN cDNA library. Histological examination of the
selected MLN samples revealed above 80% of metastatic tissue. However, in
order to avoid dilution of rare differential transcripts, we prepared the enriched
plus probe using MLNs exclusively obtained from patient C. This patient had 17
involved lymph nodes (Table I), and, in addition, her primary tumor exhibited two
poor prognostic factors, which were an estradiol and progesterone receptor
negative status (Osborne, C.K. et al., Receptors, in BREAST DISEASES 301-325
(2nd ed., Harris, J.R. et al., eds., J.B. Lippincott, Philadelphia, PA 1991)) and a
c-erbB-2 overexpression (Slamon, D.J. et al., Science 244:707-712 (1989); Borg,
A. et al., Oncogene 6:137-143 (1991); Toikkanen, S. et al., J. Clin. Oncol. 10:103-
112 (1992); Muss, H.B. et al., N. Engl. J. Med. 330:1260-1266 (1994)).

A total of 10^5 recombinants from the MLN cDNA library were
differentially screened using two enriched probes. The "plus" probe was derived
from MLN cDNAs and deprived of sequences expressed in a FA. The "minus"
probe was derived from FA cDNAs and deprived of sequences expressed in
normal liver and colon (see Materials and Methods). Comparison of the patterns
obtained with these two probes allowed for the detection of 195 "differential
plaques" which were positive with the "plus" probe and negative with the "minus"
probe. Twenty four differential plaques were subjected to a second screening and
plasmid DNAs recovered from pure plaques were tested for the presence of
"differential inserts" by Southern blot analysis (see Materials and Methods).
Identified differential inserts were 32P-labeled and used to reprobe the MLN cDNA
library filters and the Southern blots in order to identify related cDNA clones. The
same protocol was used to characterize the remaining "differential plaques" and
finally, ten independent families of differential clones were identified. The longest
cDNA insert of each family (MLN 4, 10, 19, 50, 51, 62, 64, 70, 74 and 137) was
selected for further studies.

Expression Analysis of the Ten MLN Genes

In order to test the differential expression of the genes corresponding to
these clones, Northern blots were prepared using MLN, FA and normal axillary
lymph node (NLN) RNAs. Filters were hybridized with the ten 32P-labeled MLN
cDNAs. As shown in Figure 1, all detected mRNAs were preferentially observed
in MLN (lanes 1) whereas no signal or only a faint signal was observed in NLN
and FA (lanes 2 and 3). The mRNA sizes, detected by the ten probes, varied from
0.5 kb (MLN 70) up to 8 kb (MLN 74) indicating that our screening protocol did
not favor a preferential transcript size. Although the expression levels differed,
they remained relatively high, even for the least abundant of them (MLN 62)
(Figure 1).

cDNA and Putative Protein Sequences of the Ten MLN Genes

In a first step, cDNAs were partially sequenced on both extremities using
universal primers for the pBluescript vector. These partial sequences were
compared to the combined GenBank/EMBL DNA databanks. MLN 74, 19, 10
and 4 corresponded to the already known genes fibronectin (Accession Nos.
X02761, K00799, K02273, X00307 and X00739; Kornblihtt, A.R. et al., EMBO
J. 3:221-226 (1983)), c-erbB-2 (Accession No. M11730; Coussens, L. et al.,
Science 230:1132-1139 (1985)), nonspecific cross-reacting antigen (NCA,
Accession No. M18728; Tawaragi, Y. et al., Biochem. Biophys. Res. Commun.
150:89-96 (1988)) and calcyclin (Accession Nos. M14300 and J02763; Calabretta,
B. et al., J. Biol. Chem. 261:12628-12632 (1986)), respectively. Altogether they
were the most abundant clones recovered in this screening since, as indicated in
Table II, they represented 75% of the differential clones. The relationship of these
genes to cancer and, for some of them, to metastasis has already been reported.

In a second step, when no sequence homology was initially found, the
complete cDNA sequences were established and the putative corresponding
protein sequences were compared to those present in the SwissProt databank.
MLN 70 (Accession No. X80198) and MLN 137 (Accession No. X80197)
showed homology with proteins from other species and could be classified in the
S100 and keratin families (Kligman, D. & Hilt, D.C., Trends Biol. Sci. 13:437-443
(1988); Donato, R., Cell Calcium 12:713-726 (1991); Smack, D.P. et al., J.
Amer. Acad. Dermatol. 30:85-102 (1994)), respectively. The 30 amino acid long
ZF-1 pig cysteine-rich peptide (Accession No. P80171; Sillard, R. et al., Eur. J.
Biochem. 211:377-380 (1993)) showed 100% identity to the N-terminal part of
the MLN 50 putative protein (Accession No. X82456). In addition, several
sequence homologies were found with various expressed sequence tags (ESTs;
Adams, M.D. et al., Nature 355:632-634 (1992)) within the 3' noncoding regions
of the MLN 50 (Accession Nos. T08349, T08601 and M86141, Adams, M.D.
et al., Nature 355:632-634 (1992); Adams, M.D. et al., Nat. Genet. 4:373-380
(1993); T10815, Bell, G.I. & Takeda, J., Hum. Mol. Genet. 2:1793-1798 (1993);
D12116, Okubo, K. et al., Nat. Genetics 2:173-179 (1992)) and MLN 51
(Accession No. X80199; EST Accession Nos. Z25173 and D19971, Okubo, K.
et al., Nat. Genetics 2:173-179 (1992)) cDNA sequences. Surprisingly, we
observed 100% homology with part (129 bp) of a 401 bp long EST (Accession
No. M85471, Adams, M.D. et al., Nature 355:632-634 (1992)) and the 5' coding
region of MLN 64 (Accession No. X80198), suggesting that this EST could
correspond to a chimera or to an unspliced RNA. Since most homologies
observed for MLN 50, 51 and 64 were restricted to small noncoding DNA
sequences and since no homology was found for MLN 62 (Accession No.
X80200), we assumed that they belong to new protein families and further
characterizations were undertaken.



Chromosomal Assignment of MLN 50, 51, 62 and 64 Genes

Southern blots were constructed by loading EcoRI or BamHI digest of
genomic DNAs from human somatic cell hybrids, corresponding to individual
human chromosomes in a rodent background. MLN 51 and 64 probes showed a
unique hybridization signal on chromosome 17, whereas MLN 50 and 62 probes
showed a strong hybridization to chromosome 17 and a faint signal on
chromosomes 3 and 16, and on chromosome 5, respectively (Table III). Since the
four probes showed hybridization with chromosome 17, the same Southern blot
was reprobed with MLN 19 corresponding to the c-erbB-2 oncogene, previously
localized on the chromosome 17 (Fukushige, S.I. et al., Mol. Cell. Biol. 6:955-
958 (1986)). As expected, MLN 19 showed a hybridization restricted to this
chromosome (Table III).

In order to define the precise location of the four new genes on
chromosome 17, we carried out chromosomal in situ hybridization. Using MLN
50, 100 metaphase cells were examined. 276 silver grains were associated with
the chromosomes and 83 of these (30%) were located on chromosome 17. The
distribution of grains was not random: 65/83 (78.3%) of them mapped to the
q11-q21 region of the long arm of chromosome 17 (Fig. 2(A)). Two secondary
sites were detected, at 3p22-3p21.3 (36/276, 13% of total grains) and at 16q12.1
(26/276, 9.4% of total grains). Using MLN 51, 100 metaphase cells were
examined. 176 silver grains were associated with the chromosomes and 60 of
these (34.1%) were located on chromosome 17. The distribution of grains was
not random: 49/60 (81.6%) of them mapped to the q12-q21.3 region of the long
arm of chromosome 17 (Fig. 2(A)). Using MLN 62, 150 metaphase cells were
examined. 204 silver grains were associated with the chromosomes and two sites
of hybridization were detectable. 20.1% were located on chromosome 17 and
82.9% of them mapped to the q11-q12 region of the long arm (Fig. 2(A)). 16.6%
were located on chromosome 5. The distribution of grains was not random:
79.4% mapped to the (q31-q32) region of chromosome 5 long arm. Using MLN
64, 150 metaphase cells were examined. 247 silver grains were associated with
chromosomes and 64 of these (25.9%) were located on chromosome 17. The
distribution of grains was not random: 73.4% of them mapped to the q12-q21
region of the long arm of chromosome 17 with a maximum in the q... band (Fig.
2(A)). These results are in good agreement with the findings previously obtained
by Southern blot hybridization and suggest that, along the long arm of the
chromosome 17, MLN 50 and 62 and MLN 51 and 64 are centromeric and
telomeric to MLN 19 (c-erbB-2), respectively (Fig. 2(B)).
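
The grain percentages quoted above can be rechecked from the raw counts. The sketch below does this for the two probes whose counts are given explicitly in the text (MLN 50 and MLN 51); the dictionary layout and the function name are illustrative assumptions, not part of the original analysis.

```python
# Silver grain counts as stated in the text for MLN 50 and MLN 51:
# total grains on all chromosomes, grains on chromosome 17, and grains
# within the reported 17q sub-region.
GRAIN_COUNTS = {
    "MLN 50": {"total": 276, "chr17": 83, "chr17_region": 65},
    "MLN 51": {"total": 176, "chr17": 60, "chr17_region": 49},
}


def grain_percentages(counts):
    """Return (% of all grains on chr 17, % of chr 17 grains in the region)."""
    on_chr17 = 100.0 * counts["chr17"] / counts["total"]
    in_region = 100.0 * counts["chr17_region"] / counts["chr17"]
    return round(on_chr17, 1), round(in_region, 1)
```

For MLN 50 this gives 30.1% and 78.3%, in line with the reported 30% and 78.3%. For MLN 51 the second value rounds to 81.7%, so the reported 81.6% appears to have been truncated rather than rounded.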

Amplification and Expression of MLN 50, 51, 62 and 64 Genes

Five of the cDNA clones isolated in this study corresponded to genes
located on the chromosome 17, namely MLN 50, 51, 62, 64 and 19. Moreover,
they are all localized on the long arm of chromosome 17 in the q11-q21.3 region.
Since it is known that c-erbB-2 overexpression in breast carcinomas is mostly
dependent on gene amplification (Slamon, D.J. et al., Science 235:177-182
(1987); van de Vijver, M. et al., Mol. Cell. Biol. 7:2019-2023 (1987)), we looked
for MLN 50, 51, 62 and 64 gene amplification. Each of them showed
amplification in 10-20% of sporadic breast carcinomas (data not shown).
Nevertheless, amplification does not always correlate with gene overexpression.
Then, in order to study the relationship between MLN gene amplification and
expression, we have performed genomic DNA and RNA analyses of a panel of
human breast cancer cell lines, including MCF7, T-47D, BT-474, SK-BR-3,
MDA-MB-231, BT-20 and ZR-75-1, and the immortalized breast epithelial cell
line HBL-100. MLN amplification and expression patterns were compared to
those of c-erbB-2 and of p53, a gene located on the short arm of chromosome 17
and frequently mutated or lost but never amplified in breast carcinoma (Baker, S.J.
et al., Science 244:217-221 (1989)). Hybridization of Southern blots containing
a BamHI digest of genomic DNAs extracted from these cell lines showed that the
c-erbB-2, MLN 50, 51 and 64 genes were amplified in some cell lines, whereas
the MLN 62 and p53 genes were not (Table IV). Moreover, in order to quantify
the level of amplification, dot blots containing serial dilutions of cell genomic
DNAs were performed. As summarized in Table IV, MLN 64 and c-erbB-2 genes
were found to be co-amplified in SK-BR-3 (8 and 16 copies, respectively) and
BT-474 (16 and 32 copies, respectively). MLN 50 gene was only amplified in
BT-474 (8 copies) and MLN 51 gene in SK-BR-3 (4 copies). Northern blots
containing RNAs extracted from the same cell lines were hybridized to the MLN
cDNA probes (Fig. 3). MLN 64 and 19 (c-erbB-2) genes were overexpressed in
SK-BR-3 and BT-474, MLN 50 gene in BT-474 and MLN 51 gene in SK-BR-3.
These results clearly showed that, in cell lines, MLN 50, 51 and 64 overexpression
was related to their gene amplification. Overexpression above basal level was
observed for MLN 62 in SK-BR-3 and BT-20, and for p53 in MCF7 and
HBL-100, independently of gene amplification.

Amplification patterns observed in breast cancer cell lines suggested that
MLN 50 (co-amplified with c-erbB-2, but not with MLN 62) and MLN 64
(co-amplified with c-erbB-2 in two cell lines, whereas MLN 51 was co-amplified
in only one cell line) should be located closer to c-erbB-2 than MLN 62 and 51,
respectively. Thus, according to their chromosomal assignments and
amplification patterns, the five-locus framework order cen-MLN 62-MLN
50-c-erbB-2-MLN 64-MLN 51-tel could be proposed (Fig. 2).
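
The co-amplification argument above can be illustrated with a small sketch.
It only tallies, for each MLN gene, the number of cell lines in which the gene
is amplified together with c-erbB-2, using the copy numbers quoted in the text
for Table IV. The dictionary layout and the function name are illustrative
assumptions, and genes not listed for a cell line are assumed to be
unamplified there.

```python
# Copy numbers quoted in the text for Table IV (dot-blot estimates).
# Assumption: a gene absent from a cell line's entry is not amplified there.
AMPLIFIED = {
    "SK-BR-3": {"c-erbB-2": 16, "MLN 64": 8, "MLN 51": 4},
    "BT-474": {"c-erbB-2": 32, "MLN 64": 16, "MLN 50": 8},
}
GENES = ["MLN 62", "MLN 50", "MLN 64", "MLN 51"]


def coamplification_counts(amplified, reference="c-erbB-2"):
    """Count, per gene, the cell lines where it is amplified with the reference."""
    counts = {gene: 0 for gene in GENES}
    for copies in amplified.values():
        if reference not in copies:
            continue  # the reference gene is not amplified in this cell line
        for gene in GENES:
            if gene in copies:
                counts[gene] += 1
    return counts


counts = coamplification_counts(AMPLIFIED)
for gene in GENES:
    print(gene, counts[gene])
```

On each side of c-erbB-2, the gene with the higher count (MLN 50 versus
MLN 62, MLN 64 versus MLN 51) is the one placed nearer to c-erbB-2 in the
proposed order. The tally is a heuristic reading of the argument, not a
mapping method.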
Discussion

In the present study, we report the identification of cDNAs by differential
screening of a breast cancer MLN cDNA library with two subtracted cDNA
probes, representative of malignant (MLN) and nonmalignant (FA) breast tissues.

The identified cDNAs corresponded to ten distinct genes expressed in
MLNs, but not in normal lymph nodes or FAs. 75% of these cDNAs corresponded
to known genes, namely the c-erbB-2, NCA, fibronectin and calcyclin genes,
which have been previously shown to be involved in metastasis. c-erbB-2
overexpression has been demonstrated in 15-30% of breast carcinomas and has
been associated with shorter survival, particularly in patients with invaded
nodes (Slamon, D.J. et al., Science 244:707-712 (1989); Borg, A. et al.,
Oncogene 6:137-143 (1991); Toikkanen, S. et al., J. Clin. Oncol. 10:1044-1048
(1992); Muss, H.B. et al., N. Engl. J. Med. 330:1260-1266 (1994)). NCA
belongs to the carcinoembryonic antigen (CEA) family. CEA expression is
elevated in 50-80% of patients with metastatic breast cancer and is used as a
circulating marker to detect disease recurrence (Loprinzi, C. et al., J. Clin.
Oncol. 4:46-56 (1986)). A modulation of fibronectin expression by alternative
splicing has been reported in malignant tumors (Carnemolla, B. et al., J. Cell
Biol. 108:1139-1148 (1989); Humphries, M.J., Semin. Cancer Biol. 4:293-299
(1993)). Calcyclin, a member of the S100 Ca2+ binding protein family, is a
cell cycle related protein and has been shown to be overexpressed in highly
metastatic human melanoma cell lines (Weterman, M.A. et al., Cancer Res.
52:1291-1296 (1992)). About half of the last 25% of identified cDNAs
corresponded to two novel members of the S100 and keratin protein families,
respectively. Finally, the remaining differential clones (MLN 50, 51, 62 and
64) corresponded to cDNAs which did not belong to any previously characterized
gene or protein family.

The four genes corresponding to these cDNAs were co-localized to the
q11-q21.3 region of the chromosome 17 long arm. Several genes implicated in
breast cancer progression have already been assigned to the same portion of
this chromosome, notably the oncogene c-erbB-2 in q12 (Fukushige, S.I. et al.,
Mol. Cell. Biol. 6:955-958 (1986)) and the recently cloned tumor suppressor
gene BRCA1 in q21 (Hall, J.M. et al., Science 250:1684-1689 (1990); Miki, Y.
et al., Science 266:66-71 (1994) and refs. therein). According to their
chromosomal assignments, we mapped the four novel genes proximal (MLN 62 and
50) and distal (MLN 64 and 51) to the c-erbB-2 gene, and, most probably,
proximal to the BRCA1 gene.

In vivo, the four MLN genes showed amplification in 10-20% of breast
carcinomas. Moreover, in breast cancer cell lines, MLN 64 exhibited an
amplification pattern identical to that of c-erbB-2, showing a clear
amplification in BT-474 and SK-BR-3. However, MLN 50 and 51 gene amplification
was restricted to BT-474 and SK-BR-3, respectively, and none of the cell lines
showed MLN 62 amplification. Altogether, these results support the concept
that c-erbB-2 amplicon nature and size are variable from one malignant cell
line to another (Muleris, M. et al., Genes Chrom. Cancer 10:160-170 (1994)),
exemplifying the breast cancer heterogeneity (Lonn, U. et al., Intl. J. Cancer
58:40-45 (1994) and refs. therein). Finally, in breast cancer cell lines,
MLN 50, 51 and 64 gene overexpression was correlated with gene amplification.

It is assumed that DNA amplification plays a crucial role in tumor
progression by allowing cancer cells to upregulate numerous genes
(Kallioniemi, A. et al., Proc. Natl. Acad. Sci. USA 91:2156-2160 (1994); Lonn,
U. et al., Intl. J. Cancer 58:40-45 (1994)). Frequency of gene amplification
as well as gene copy number increase during breast cancer progression, notably
in patients who do not respond to treatment, suggesting that overexpression of
the amplified target genes confers a selective advantage to malignant cells
(Schimke, R.T., J. Biol. Chem. 263:5989-5992 (1988); Lonn, U. et al., Intl. J.
Cancer 58:40-45 (1994); Guan, X.Y. et al., Nat. Genet. 8:155-161 (1994)).
Recently, amplified loci, distinct from those of currently known oncogenes,
have been mapped using comparative genomic hybridization (Kallioniemi, A.
et al., Proc. Natl. Acad. Sci. USA 91:2156-2160 (1994); Muleris, M. et al.,
Genes Chrom. Cancer 10:160-170 (1994)), suggesting the presence of unknown
genes whose expression contributes to breast cancer. As we report here, the
use of differential screening could be an efficient methodology for the
identification of such unknown genes, since it allows for the direct cloning
of amplified and overexpressed genes. Although amplification involves large
regions of chromosomal DNA, it is known to target oncogenes (Schwab, M. &
Amler, L., Genes Chrom. Cancer 1:181-193 (1990)). The correlation between
amplification and overexpression is necessary to identify the targeted gene.
Thus, within the 17q12 amplicon, c-erbB-2 is often co-amplified with c-erbA,
but c-erbA overexpression was never observed (van de Vijver, M. et al., Mol.
Cell. Biol. 7:2019-2023 (1987)). A similar finding was observed within the
11q13 amplicon, where the cyclin D/PRAD1 gene is linked to int-2 and hst-1,
two fibroblast growth factor related genes, and only PRAD1 is overexpressed in
the carcinomas (Lammie, G.A. et al., Oncogene 6:439-444
(1991)). In this context, the fact that the four novel genes identified in the
present study are not only amplified but also overexpressed suggests that they
may contribute to the genesis and/or the progression of breast tumors.

Table I: Clinical and Histological Characteristics of the Breast Carcinomas

Patient   Age (yrs.)   Tumor size (cm)   Histological grade   Number of involved lymph nodes
A         40           2x1.5x1.5         III                  1/15
B         35           2.5x1.8x1.6       II                   5/14
C         50           2.7x2.0x1.5       II                   17/19
D         40           3.5x3.0x2.0       III                  2/10
                       2.0x1.5x2.0

Example 2

CART1, a Gene Expressed in Human Breast Carcinoma, Encodes a Novel
Member of the Tumor Necrosis Factor Receptor-Associated Protein Family

Introduction

Human CART1 cDNA corresponds to the MLN 62 cDNA clone discussed
above in Example 1. The clone was identified through a differential screening
performed by using two subtractive probes, respectively representative of
metastatic and nonmalignant breast tissues, and was mapped on chromosome 17,
at the q11-q12 locus, a locus which includes the oncogene c-erbB-2, whose
overexpression is correlated with a shorter overall and disease-free survival
for breast cancer patients (Slamon, D.J. et al., Science 235:177-182 (1987);
Muss, H.B. et al., N. Engl. J. Med. 330:1260-1266 (1994)).

In this example, we investigated CART1 gene expression in a panel of
normal and malignant human tissues and characterized the CART1 cDNA, protein
and gene organization. CART1 was specifically expressed in epithelial breast
cancer cells. The amino acid sequence of CART1 reveals structural domains
similar to those present in TNF receptor associated proteins, suggesting that
CART1 is implicated in signal transduction for TNF-related cytokines.

Materials and Methods 
Tissue Collection

Depending on subsequent analysis, tissues were either immediately frozen
in liquid nitrogen (RNA extraction), or fixed in formaldehyde and paraffin
embedded (in situ hybridization). Frozen tissues were stored at -80°C whereas
paraffin-embedded tissues were stored at 4°C.

The mean age of the 39 patients included in the present study was
55 years. The main characteristics of the breast carcinomas were as follows:
SBR grade I (13%), grade II (38%), grade III (49%); estradiol receptor positive
(25%), negative (75%); lymph nodes without invasion (39%), with invasion
(61%).

RNA Extraction and Analysis

Total RNA prepared by a single-step method using guanidinium
isothiocyanate (Chomczynski, P. & Sacchi, N., Anal. Biochem. 162:156-159
(1987)) was fractionated by agarose gel electrophoresis (1%) in the presence
of formaldehyde. After the transfer, RNA was immobilized by heating (12 hr,
80°C). Filters (Hybond N; Amersham Corp.) were acidified (10 min, 5% CH3COOH)
and stained (10 min, 0.004% methylene blue, 0.5 M CH3COONa, pH 5.0) prior to
hybridization.

A CART1 probe corresponding to the full-length human cDNA
(nucleotides 1 to 2004), cloned into pBluescript II SK vector (Stratagene), was
32P-labeled using random priming (~10* cpm/ng DNA) (Feinberg, A.P. &
Vogelstein, B., Anal. Biochem. 132:6 (1983)). Filters were prehybridized for
2 hrs at 42°C in 50% formamide, 5x SSC, 0.1% SDS, 0.5% PVP, 0.5% Ficoll, 50
mM sodium pyrophosphate, 1% glycine and 500 µg/ml ssDNA. Hybridization
was for 18 hrs under stringent conditions (50% formamide, 5x SSC, 0.1% SDS,
0.1% PVP, 0.1% Ficoll, 20 mM sodium pyrophosphate, 10% dextran sulfate, 100
µg/ml ssDNA; 42°C). Filters were washed for 30 min in 2x SSC, 0.1% SDS at
room temperature, followed by 30 min in 0.1x SSC, 0.1% SDS at 55°C.

In Situ Hybridization

In situ hybridization was performed using a 35S-labeled antisense RNA
probe (5x10* cpm/µg), obtained after in vitro transcription of a BglII fragment
(nucleotides 279-1882) of the human CART1 cDNA. Formaldehyde-fixed
paraffin-embedded tissue sections (6 µm thick) were deparaffined in LMR,
rehydrated and digested with proteinase K (1 µg/ml; 30 min, 37°C).
Hybridization was for 18 hrs, followed by RNase treatment (20 µg/ml; 30 min,
37°C) and two stringent washes (2x SSC, 50% formamide; 60°C, 2 hrs).
Autoradiography was for 2 to 4 weeks using NTB2 emulsion (Kodak). After
exposure, the slides were developed and counterstained using toluidine blue.
35S-labeled sense transcript from CART1 was tested in parallel as a negative
control.

CART1 Genomic DNA Cloning

I^pg of human genomic DNA was partially digested with Sau3A. After
size selection on a 10-30% sucrose gradient, inserts (16-20 kb) were subcloned
at the BamHI replacement site in lambda EMBL 301 (Lathe, R. et al., Gene
57:193-201 (1987)). 2.5x10* recombinant clones were obtained and the library
was amplified once. One million pfu were analyzed for the presence of genomic
CART1 DNA, using the full-length CART1 cDNA probe. Thirty clones gave a
positive signal. After a second screening, four of these clones were subcloned
into pBluescript II SK- vector (Stratagene), sequenced and positioned with
respect to the CART1 cDNA sequence.

Sequencing Reactions 

CART1 cDNA clones and genomic subclones prepared as described (Zhou,
C. et al., Biotechniques 8:172-173 (1990)) were further purified with RNase A
treatment (10 µg/ml; 30 min, 37°C) followed by PEG/NaCl precipitation (0.57
vol; 20%, 2 M) and ethanol washing. Vacuum dried pellets were resuspended at
200 ng/µl in TE. Double-stranded DNA templates were then sequenced with Taq
polymerase, using either pBluescript universal primers and/or internal
primers, and dye-labeled dNTPs for detection on an Applied Biosystems 373A
automated sequencer.

Computer Analysis

Sequence analysis was performed using the GCG sequence analysis
package (Wisconsin Package version 8, Genetics Computer Group). The CART1
cDNA sequence and its deduced putative protein were used to search the
complete combined GenBank/EMBL databases and the complete SwissProt
database, respectively, with the BLAST (Altschul, S.F. et al., J. Mol. Biol.
215:403-410 (1990)) and FastA (Pearson, W.R. & Lipman, D.J., Proc. Natl. Acad.
Sci. USA 85:2444-2448 (1988)) programs. The RING finger motif and consensus
sequences of CART1 protein were further identified by the Motifs program in
the PROSITE dictionary (release 12). The sequence alignments were obtained
automatically by using the program PileUp (Feng, D.F. & Doolittle, R.F., J.
Mol. Evol. 25:351-360 (1987)).

Results 

Expression of the CART1 Gene

Using Northern blot analysis, we have studied CART1 gene expression in
benign (16 fibroadenomas) and malignant (39 carcinomas and 5 metastatic
axillary lymph nodes) human breast tissues. Hybridization with a CART1 cDNA
probe gave a positive signal corresponding to CART1 transcripts with an
apparent molecular weight of 2 kb, in 4 carcinomas and 2 metastases (Fig. 4,
lanes 7, 11, 13 and 17, and data not shown). The fibroadenomas did not show
CART1 expression above the basal level (Fig. 4, lanes 1-6). No CART1
transcripts were observed in normal human axillary lymph node, skin, lung,
stomach, colon, liver, kidney and placenta (data not shown).

In situ hybridization, using an antisense CART1 RNA probe, was
performed on primary breast carcinomas and axillary lymph node metastases.
CART1 was expressed in malignant epithelial cells (Fig. 5(C)) and invasive
carcinomas (Fig. 5(B)), whereas tumoral stromal cells were negative. CART1
transcripts were homogeneously distributed among the positive areas. Normal
epithelial cells did not express the CART1 gene, even when located in the
proximity of invasive carcinomatous areas (Fig. 5(A) and data not shown). A
similar pattern of CART1 gene expression was observed in metastatic axillary
lymph nodes from breast cancer patients, with expression limited to cancer
cells, whereas noninvolved lymph node areas were negative (Fig. 5(D) and data
not shown).

Determination of Human CART1 cDNA and Putative Protein
Sequences

The complete CART1 cDNA sequence has been established from three
independent cDNA clones. Both sense and antisense strands have been
sequenced. The longest cDNA clone contained 2004 bp, a size consistent with
the previously observed 2 kb transcript, suggesting that this cDNA
corresponded to a full-length CART1 cDNA (Fig. 6) (SEQ ID NO:1). The first ATG
codon (at nucleotide position 85) had the most favorable context for
initiation of translation (Kozak, M., Nucl. Acids Res. 15:8125-8149 (1987)),
and a classical AATAAA poly(A) addition signal sequence (Wahle, E. & Keller,
W., Annu. Rev. Biochem. 61:419-440 (1992)) was located 18 bp upstream of the
poly(A) stretch. Thus, the open reading frame was predicted to encode a
470-residue protein (Fig. 6) (SEQ ID NO:2), with a molecular weight of 53 kD
and a pHi of 8. The putative protein showed several consensus sequences,
notably two potential nuclear localization signals (NLS), a monopartite KPKRR
(residues 11-15 of Fig. 6, SEQ ID NO:2) (Dang, C.V. & Lee, W.M.F., J. Biol.
Chem. 264:18019-18023 (1989)) and a bipartite RR-X11-KKRLK (residues 123-140
of Fig. 6, SEQ ID NO:2) (Dingwall, C. & Laskey, R.A., Trends Biochem. Sci.
16:478-480 (1991)). The molecule also contained potential sites (reviewed in
Kemp, B.E. & Pearson, R.B., Trends Biochem. Sci. 15:342-346 (1990)) specific
for N-glycosylation (NGS, residues 355-357 of Fig. 6, SEQ ID NO:2),
phosphorylation by casein kinase I (EELS, residues 300-303; SVGS, residues
303-306; ECFS, residues 331-334; all of Fig. 6, SEQ ID NO:2) and casein kinase
II (SEE, residues 86-88; SRRD, residues 122-125; SGE, residues 149-151; SH^,
residues 155-157; TSE, residues 185-187; TKE, residues 199-201; SGE, residues
357-359; SLLD, residues 389-392; SLDE, residues 426-429; SHQD, residues
441-444; all of Fig. 6, SEQ ID NO:2), proline-dependent phosphorylation (FSPA,
residues 333-336 of Fig. 6, SEQ ID NO:2) and cAMP-dependent phosphorylation
(RRVT, residues 384-387 of Fig. 6, SEQ ID NO:2). Moreover, two cysteine-rich
(C-rich) regions were identified, one located at the N-terminal part of the
protein (residues 18-57) and the other at the core of the molecule (residues
83-282). Finally, the C-terminal part of the CART1 protein corresponded to the
recently described TRAF domain (Rothe, M. et al., Cell 78:681-692 (1994))
(Fig. 6).
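
As a worked check of the coordinates quoted above, the following sketch
derives the span of the open reading frame from the reported ATG position and
protein length. It assumes that position 85 is the A of the initiating ATG and
that the stop codon immediately follows codon 470. The function name is
illustrative and not part of the original analysis.

```python
def orf_coordinates(start_nt, n_residues):
    """Return (first coding nt, last coding nt, last nt of the stop codon), 1-based."""
    coding_end = start_nt + 3 * n_residues - 1
    return start_nt, coding_end, coding_end + 3


CDNA_LENGTH = 2004  # bp, longest CART1 cDNA clone reported in the text

start, coding_end, stop_end = orf_coordinates(85, 470)
print("coding region:", start, "-", coding_end)
print("stop codon ends at:", stop_end)
print("nt upstream of the ATG:", start - 1)
print("nt downstream of the stop codon:", CDNA_LENGTH - stop_end)

# Length of the bipartite NLS: RR + 11 spacer residues + KKRLK
nls_length = 2 + 11 + 5
print("bipartite NLS length:", nls_length, "residues; 123-140 spans", 140 - 123 + 1)
```

Under these assumptions the coding region would occupy nucleotides 85 to 1494,
and the 18 residues of the RR-X11-KKRLK signal agree with the quoted interval
123-140.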

CART1 Contains an Unusual N-terminal RING Finger Motif

The N-terminal C-rich structure of the putative CART1 protein contained
a CX2CX12CX1HX2CX2CX11CX2D (C3HC3D) motif (residues 18-57 of Fig. 6,
SEQ ID NO:2) reminiscent of the C3HC4 consensus sequence (Freemont, P.S.
et al., Cell 64:483-484 (1991); Fig. 7). This sequence, located either at the
N- or at the C-terminal part of proteins, could potentially give rise to two
zinc fingers and has been named the RING finger motif (Freemont, P.S., Ann.
N.Y. Acad. Sci. 684:174-192 (1993) and refs. therein). The proteins which
share such a structure often exhibit DNA or RNA binding properties, and have
been reported to be implicated in development, such as DG17 (Driscoll, D.M. &
Williams, J.G., Mol. Cell. Biol. 7:4482-4489 (1987)) and SU(z)2 (van Lohuizen,
M. et al., Nature 353:353-355 (1991)), gene transcription, such as RPT-1
(Patarca, R. et al., Proc. Natl. Acad. Sci. USA 85:2733-2737 (1988)), SS-A/Ro
(Chan, E.K.L. et al., J. Clin. Invest. 87:68-76 (1991)), XNF7 (Reddy, B.
et al., Dev. Biol. 148:107-116 (1991)) and RING1 (Lovering, R. et al., Proc.
Natl. Acad. Sci. USA 90:2112-2116 (1993)), DNA repair, such as RAD-18 (Jones,
J.S. et al., Nucl. Acids Res. 16:7119-7131 (1988)), cell transformation, such
as MEL-18 (Tagawa, M. et al., J. Biol. Chem. 265:20021-20026 (1990); Goebl,
M.G., Cell 66:623 (1991)), tumor suppression, such as BRCA1 (Miki, Y. et al.,
Science 266:66-71 (1994)), or signal transduction, such as CD40-binding
protein (CD40-bp) (Hu, H.M. et al., J. Biol. Chem. 269:30069-30072 (1994)) and
TRAF2 (Rothe, M. et al., Cell 78:681-692 (1994)). The distribution of C- and
H-residues is highly conserved in all these RING fingers (Fig. 7). However,
CART1 contained an aspartic acid (D-) residue instead of the last C-residue of
the C3HC4 motif (Fig. 7). In order to confirm the presence of this D-residue,
and since the D-codon sequence leads to an AvaII restriction site (Fig. 8(A)),
an AvaII digestion was performed on the full-length CART1 cDNA. Gel
electrophoresis showed the presence of four bands (253, 428, 531 and 792 bp,
respectively), a pattern consistent with the presence of a D-codon
(Fig. 8(B)). However, since the CART1 cDNA was cloned from a cDNA library
established using malignant tissues, we could not exclude the possibility that
the D-residue resulted from an alteration occurring during carcinogenesis
(Bishop, J.M., Cell 64:235-248 (1991)). Thus, in order to identify the
physiological residue, we sequenced CART1 DNA from a normal leukocyte genomic
library (see Materials and Methods). This analysis confirmed the presence of a
D-residue, and consequently the C3HC3D motif. Data bank library analysis did
not reveal any other protein sharing an identical RING finger motif.
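
The restriction pattern reported above can be sanity-checked with simple
arithmetic: the four bands should add up to the length of the digested cDNA.
The sketch below assumes a linear 2004 bp insert free of vector sequence, so
that four fragments imply three cut sites. The positions of the sites are not
given in the text and are not inferred here.

```python
FRAGMENTS_BP = [253, 428, 531, 792]  # bands reported after the digestion
CDNA_LENGTH_BP = 2004                # full-length CART1 cDNA


def digest_summary(fragments, expected_length):
    """Summarize a complete digest of a linear molecule."""
    total = sum(fragments)
    return {
        "total_bp": total,
        "cut_sites": len(fragments) - 1,  # linear DNA: n fragments, n - 1 cuts
        "matches_cdna": total == expected_length,
    }


summary = digest_summary(FRAGMENTS_BP, CDNA_LENGTH_BP)
print(summary)
```

The fragment sizes sum to 2004 bp, which agrees with the length reported for
the full-length cDNA.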

Identification and Characterization of a Novel C-rich Motif, the CART
Motif

The second C-rich region extended from residues 83 to 282 and
constituted almost half of the protein (Fig. 6) (SEQ ID NO:2). It contained 23
C- and 12 H-residues, corresponding to 96% and 67% of the remaining C- and H-
residues, respectively. A careful examination of the spacing of these C/H
residues allowed the detection of an arrangement giving rise to three
H3QCX.CX,CXii.^C3QqX«pC„ (HC3HC3) repeats. The most N-terminal
of them (residues 101-154) contained the potential bipartite NLS (Figs. 6 and
9). Homologies between these repeats were not restricted to the C/H residues
and to the spacer sizes. Alignment of the three CART1 HC3HC3 motifs showed
around 50% similarity and 30% identity with each other (Fig. 9).
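
For readers unfamiliar with how a percent identity such as the one quoted
above is obtained from a pairwise alignment, a minimal sketch follows. The two
aligned strings are short hypothetical examples, not CART1 sequences, and the
convention of ignoring gapped columns is an assumption (alignment programs
differ on the denominator).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments (10 columns, one gapped column)
toy_a = "HACDC-KCAH"
toy_b = "HSCECGRCAH"
identity = percent_identity(toy_a, toy_b)
print(round(identity, 1))
```

Percent similarity is computed the same way, except that conservative
substitutions are also counted, which requires a residue grouping or scoring
matrix that is not specified here.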

Homology searches in the protein database revealed the presence of one
copy of an analogous motif (residues 193-250) in the Dictyostelium discoideum
DG17 protein (Fig. 9) (SEQ ID NO:28) (Driscoll, D.M. & Williams, J.G., Mol.
Cell. Biol. 7:4482-4489 (1987)), and of two copies in the human CD40-bp
(Fig. 9) (residues 134-189 and 190-248, SEQ ID NOS:24 and 25, respectively)
(Hu, H.M. et al., J. Biol. Chem. 269:30069-30072 (1994)) and in the mouse
TRAF2 (Fig. 9) (residues 124-176 and 177-238, SEQ ID NOS:26 and 27) (Rothe, M.
et al., Cell 78:681-692 (1994)). It should be noted that the sequences of the
two N-terminal CART1 HC3HC3 motifs were most similar to those of the
N-terminal motifs of CD40-bp (50% and 40%, respectively) and of TRAF2 (52% and
46%, respectively). The C-terminal CART1 HC3HC3 motif, however, was most
similar to the C-terminal motifs of CD40-bp (58%) and of TRAF2 (55%), and to
that of DG17 (51%) (Fig. 9). From these comparisons, the
iD^CX^CXa.,^„CX,CX«CX„ consensus sequence was proposed for
this novel motif, which we named the CART motif for "C-rich motif Associated
to RING and TRAF domains" (see infra) (Fig. 9).

CART1 Contains a C-terminal TRAF Domain

The TRAF domain, recently identified in the TNF receptor-associated
factors 1 (TRAF1) and 2 (TRAF2), is involved in the TNF signal transduction
pathway. TRAF domains encompass the 230 C-terminal residues of these proteins
and share 53% identity (Rothe, M. et al., Cell 78:681-692 (1994)). The TRAF
motif was also reported in the CD40-bp, which associates with the cytoplasmic
tail of CD40, another member of the TNF receptor family (Hu, H.M. et al., J.
Biol. Chem. 269:30069-30072 (1994)). The C-terminal part of CART1 (residues
267-470) showed two degrees of homology with the TRAF domain. Thus, residues
267 to 307 showed a weak homology (12-23% identity). From structural
predictions, this N-terminal part of the CART1 TRAF domain is supposed to give
rise to an α-helix (Chou, P.Y. & Fasman, G.D., Annu. Rev. Biochem. 47:251-276
(1978)). Such a structure, already proposed for the corresponding regions of
TRAF1, TRAF2 and CD40-bp, is supposed to be involved in protein/protein
interactions (Rothe, M. et al., Cell 78:681-692 (1994); Hu, H.M. et al., J.
Biol. Chem. 269:30069-30072 (1994)). The C-terminal part of the CART1 TRAF
domain (residues 308-470) showed a high degree of similarity and identity with
the corresponding part of TRAF1 (60% and 42%), TRAF2 (69% and 47%) and
CD40-bp (62% and 43%), thus defining a "restricted TRAF domain" (Fig. 10).
Finally, since DG17 already contained an N-terminal RING finger and a CART
motif, we looked for the presence of a restricted TRAF domain in its
C-terminal part. We observed 55% similarity and 30% identity between the last
150 residues of CART1 and DG17 (data not shown). However, the protozoan DG17
protein showed numerous mismatches with the restricted TRAF consensus motif
defined from human and mouse proteins (Fig. 10), suggesting that DG17 contains
a primitive TRAF domain.

CART1 Gene Organization

Two independent clones have been selected from a screening of a human
leukocyte genomic library using the full-length CART1 cDNA probe. These
clones contained 3 and 3.2 kb BamHI fragments which have been subcloned and
partially sequenced in order to map splicing sites. The human CART1 gene was
found to be split into 7 exons (Fig. 11 and Table V (exon/intron Nos. 1-6
corresponding to SEQ ID NOS:52-57, respectively)). Comparison of the
intron/exon boundaries showed that each corresponded to a canonical splice
consensus sequence (Breathnach, R. & Chambon, P., Annu. Rev. Biochem.
50:349-383 (1981)). The total length of the CART1 gene is approximately 5.5 kb
(Fig. 11). Analysis of the genomic structure of the RING finger domain
revealed that it is encoded by two exons separated by the presence of an
intronic sequence located between nucleotides 226-227 (Fig. 4). Thus, the
C3HC2 and the CD parts of the C3HC3D motif are encoded by exons 1 and 2,
respectively. The three CART motifs were encoded by three separate exons of
161 (exon 4) (SEQ ID NO:55), 161 (exon 5) (SEQ ID NO:56) and 156 (exon 6) (SEQ
ID NO:57) bp, respectively (Fig. 11 and Table V). In addition to their similar
size, the three exons exhibited about 40% identity with each other, suggesting
they have arisen by duplication of an ancestral exon. Finally, the α-helix and
the restricted TRAF domain were encoded by exon 7, which also encoded the 3'
untranslated region.

CART1 Protein Subcellular Localization - CART1 subcellular
localization was performed on paraffin-embedded sections from a human invasive
breast carcinoma using a rabbit polyclonal antibody. The antibody specificity
was established by Western blot analysis of CART1 recombinant protein (data
not shown). Consistent with our findings using in situ hybridization, CART1
immunoperoxidase staining (brown staining) was observed in malignant
epithelial cells. Moreover, CART1 protein appeared to be located in the
nucleus, showing that at least one of the CART1 nuclear localization signals
was functional. The
intensity of staining was variable from one cell to another of the section.

Discussion

We characterized a cDNA and corresponding putative protein encoded by
a novel gene that we call the CART1 gene (identified as MLN 62 in Example 1)
by screening a breast cancer metastatic lymph node cDNA library. CART1 was
overexpressed in 10% of primary breast carcinomas and 50% of metastatic
axillary lymph nodes, whereas it was not in the corresponding nonmalignant
tissues. CART1 transcripts were specifically detected in malignant epithelial
cells and homogeneously distributed throughout the carcinomatous areas. No
CART1 expression was observed in a panel of normal human tissues including
skin, lung, stomach, colon, liver, kidney and placenta. This expression
pattern, restricted to some malignant tissues, suggests that CART1 is involved
in processes leading to the formation and/or progression of primary carcinomas
and metastases. The putative CART1 protein sequence, deduced from the cDNA
open reading frame, exhibited several structural domains. The CART1 N-terminal
part contained a C-rich domain characterized by the presence of a RING finger
(Freemont, P.S., Ann. N.Y. Acad. Sci. 684:174-192 (1993)). The RING finger
protein family presently comprises more than 70 members involved in the
regulation of cell proliferation and differentiation (reviewed in Freemont,
P.S., Ann. N.Y. Acad. Sci. 684:174-192 (1993)). Interestingly, one of the
recently identified members of the family is the tumor suppressor gene BRCA1,
responsible for about 50% of inherited breast cancers (Miki, Y. et al.,
Science 266:66-71 (1994)). The RING finger motif is assumed to fold into two
zinc fingers and to be involved in protein/nucleic acid interaction(s)
(Schwabe, J.W.R. & Klug, A., Nature Struct. Biol. 1:345-349 (1994) and refs.
therein). In the CART1 RING finger, the last C-residue is substituted by a
D-residue, giving rise to a C3HC3D motif instead of the usual C3HC4 motif.
Since aspartic acid has already been described as a potential zinc
coordinating residue (Vallee, B.L. & Auld, D.S., Biochem. 29:5647-5659
(1990)), we assume that the C3HC3D motif may efficiently bind metal atoms
through the zinc finger structure. Consistent with this hypothesis, aspartic
acid has already been reported to be functional in another type of zinc finger
motif, the LIM domain (Sánchez-García, I. & Rabbitts, T.H., Trends Genet.
9:315-320 (1994) and refs. therein).
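
The difference between the usual C3HC4 motif and the C3HC3D variant discussed
above can be expressed as a pattern search. The sketch below is illustrative
only: the spacer ranges are those commonly quoted for the generic RING finger
consensus and are an assumption, not values taken from this document, and the
test peptides are hypothetical poly-alanine toys rather than the CART1
sequence.

```python
import re

# Shared part of the pattern: C3-H-C3 with assumed generic RING spacer ranges.
SPACERS = r"C.{2}C.{9,39}C.{1,3}H.{2,3}C.{2}C.{4,48}C.{2}"
C3HC4 = re.compile(SPACERS + "C")              # canonical: last ligand is Cys
C3HC4_OR_C3HC3D = re.compile(SPACERS + "[CD]")  # relaxed: last ligand Cys or Asp


def make_toy_finger(last_residue):
    """Build a hypothetical 40-residue finger with alanine spacers."""
    return (
        "C" + "AA" + "C" + "A" * 12 + "C" + "A" + "H" + "AA"
        + "C" + "AA" + "C" + "A" * 11 + "C" + "AA" + last_residue
    )


toy_c4 = make_toy_finger("C")
toy_c3d = make_toy_finger("D")

for name, peptide in (("C3HC4 toy", toy_c4), ("C3HC3D toy", toy_c3d)):
    strict_hit = C3HC4.search(peptide) is not None
    relaxed_hit = C3HC4_OR_C3HC3D.search(peptide) is not None
    print(name, "strict:", strict_hit, "relaxed:", relaxed_hit)
```

A strict C3HC4 pattern misses the aspartate variant, which is one possible
reason a motif of this kind could go undetected by a pattern-based search
unless the last position is relaxed.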

The CART1 RING finger is encoded by two exons coding for the C3HC2 and
CD parts of the C3HC3D motif, respectively, a genomic organization slightly
different from that previously described for the consensus MEL-18 RING finger,
which results from two exons encoding the C3H and C4 putative zinc fingers,
respectively (Asano, H. et al., DNA Sequence 3:369-377 (1993)).

CART1 also contained an original C-rich region, located more centrally
within the protein and composed of three repeats of an HC3HC3 motif
corresponding to a novel protein signature that we designated the CART
motif. These three repeats were encoded by distinct exons homologous with each
other, suggesting that they derived from an ancestral exon. CART motifs were
only found, in variable copy numbers, in three RING finger proteins, the human
CD40-bp (two copies), the mouse TRAF2 (two copies) and the Dictyostelium
discoideum DG17 protein (one copy) (Hu, H.M. et al., J. Biol. Chem.
269:30069-30072 (1994); Rothe, M. et al., Cell 78:681-692 (1994); Driscoll,
D.M. & Williams, J.G., Mol. Cell. Biol. 7:4482-4489 (1987)). The corresponding
C-rich regions of CD40-bp, TRAF2 and DG17 have been previously reported to be
partially arranged in a pattern resembling either the CHC3H2 "B box" motif or
the C2H2 Xenopus laevis transcription factor IIIA motif (Freemont, P.S., Ann.
N.Y. Acad. Sci. 684:174-192 (1993); Hu, H.M. et al., J. Biol. Chem.
269:30069-30072 (1994); Rothe, M. et al., Cell 78:681-692 (1994); Driscoll,
D.M. & Williams, J.G., Mol. Cell. Biol. 7:4482-4489 (1987)). The CART motif,
as defined in the present study, encompasses almost the totality of the C-rich
region observed in CART1, CD40-bp, TRAF2 and DG17. The function of the CART
domain remains to be determined. Preliminary protein studies (C.R.,
unpublished results) indicate that the correct folding of the CART motif is
dependent on the presence of zinc, supporting the hypothesis that CART
corresponds to a novel zinc binding motif presumably involved in nucleic acid
binding (Schwabe, J.W.R. & Klug, A., Nature Struct. Biol. 1:345-349 (1994);
Schmiedeskamp, M. & Klevit, R.E., Curr. Opin. Struct. Biol. 4:28-35 (1994)).

The C-terminal part of CART1 corresponded to a TRAF domain
previously identified in TRAF1, TRAF2 and CD40-bp. This motif is involved in
protein/protein interaction, and TRAF2 and CD40-bp have been reported to
specifically interact with the cytoplasmic domain of two members of the
TNF-receptor family, TNF-R2 and CD40, respectively (Rothe, M. et al., Cell
78:681-692 (1994); Hu, H.M. et al., J. Biol. Chem. 269:30069-30072 (1994)).
The TRAF domain is composed of two structural domains, an N-terminally located
domain which corresponds to a weakly conserved alpha helix and a C-terminally
located domain which is highly conserved and corresponds to what we called the
"restricted TRAF domain," since it includes only part of the previously
described TRAF domains (Rothe, M. et al., Cell 78:681-692 (1994); Hu, H.M.
et al., J. Biol. Chem. 269:30069-30072 (1994)). Both structural motifs were
encoded by the same exon of the CART1 gene. Homology was also observed with
the C-terminal part of the protozoan DG17 protein which, although less
conserved, could be considered as a TRAF domain.

Thus, CART1 shared a protein organization similar to that of the human
CD40-bp, the mouse TRAF2 and protozoan DG17, including an N-terminal RING
finger, one to three central CART motifs and a C-terminal TRAF domain
(Fig. 12). These results suggest that these structurally related proteins
belong to the same protein family and may exhibit analogous functions. DG17 is
expressed during Dictyostelium discoideum aggregation, which occurs under
stress conditions in order to permit cell survival through a differentiated
multicellular organism. The precise function of DG17 remains unknown
(Driscoll, D.M. & Williams, J.G., Mol. Cell. Biol. 7:4482-4489 (1987)).
However, both CD40-bp and TRAF2 have been previously shown to be involved in
TNF-related cytokine signal transduction (Hu, H.M. et al., J. Biol. Chem.
269:30069-30072 (1994); Rothe, M. et al., Cell 78:681-692 (1994)). In contrast
to growth factor receptors, cytokine receptors generally do not contain kinase
activity in their cytoplasmic region, and their signal transduction mechanisms
remain elusive (reviewed in Taga, T. & Kishimoto, T., FASEB J. 6:3387-3396
(1993)). To date, the TNF and TNF receptor families contain 8 and 12 members,
respectively. The lack of sequence homology among TNF-receptor cytoplasmic
domains, required for signal transduction, suggests the existence of a
specific signaling pathway for each receptor (reviewed in Smith, C.A. et al.,
Cell 76:959-962 (1994)). Recently, it has been proposed that signal
transduction through CD40 and TNF-R2 involved the interaction of their
cytoplasmic domain with two cytoplasmic proteins, CD40-bp and TRAF2,
respectively (Rothe, M. et al., Cell 78:681-692 (1994); Hu, H.M. et al., J.
Biol. Chem. 269:30069-30072 (1994)). Thus, CD40-bp and TRAF2 could be latent
cytoplasmic transcription factors, which would be translocated to the nucleus
upon receptor activation by their respective ligands. A similar system has
already been proposed for the protein family of signal transducers and
activators of transcription (STAT) involved in gene activation pathways
triggered by interferons (Darnell, J.E. et al., Science 264:1415-1421 (1994)).
This system implies a direct signal transduction pathway through STAT
migration from cytoplasm to nucleus, presumably triggered by STAT
phosphorylation following receptor activation (Ihle, J.N. et al., Trends
Biochem. Sci. 19:222-227 (1994); Darnell, J.E. et al., Science 264:1415-1421
(1994)). From all these observations, it is tempting to speculate that CART1,
which not only shares a structural arrangement of RING, CART and TRAF
domains identical to that observed in two TNF receptor associated proteins,
but also exhibits putative NLS and phosphorylation sites, may exert a similar
function for TNF-related cytokine signal transduction.

TNF ligand family members have been shown to induce pleiotropic
biological effects, including cell differentiation, proliferation, activation or death,
all processes involved during carcinogenesis and tumor progression (Smith, C.A.
et al., Cell 76:959-962 (1994), and refs. therein). In breast carcinomas, p55 and
p75 TNF receptors have been shown to be expressed in malignant tissues, and a
dramatic increase of the secretion of their corresponding TNFα ligand has been
associated with the metastatic step of the disease (Pusztai, L. et al., Brit. J. Cancer
70:289-292 (1994), and refs. therein). Our observation of CART1 overexpression
in breast carcinomas suggests that CART1 may be involved in a signal transduction
pathway either involving p55/p75 or another member of the TNF-receptor family.
The nature of the TNF receptor as well as the nature of the protein(s) which may
interact with CART1 are now under characterization.

Table V

Exon/Intron Organization of the CART1 Gene

EXON                                                          INTRON
No.   Size (bp)   5' splice donor     3' splice acceptor      No.   Size (bp)

1     ~500        CCTCAGgtgctg..      ..tatcagTGAAGKS         1     ~2100
2     52          GCCAAGglgaig..      ..ccccagATCEAC          2     581
3     105         CTACAGgtgagg..      ..caccagGGCCAC          3     69
4     161         TATGAGgtgggt..      ..ttccagAGCXTkT         4     83
5     161         ATCCAGgtgagg..      ..ccccagAGCCAC          5     87
6     155         CACAGGg^ga..        ..caacagTGCCCT          6     150
7     1140

Exon sequences are indicated in capital letters, and intron sequences in small letters.
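
The capital/small letter convention of Table V lends itself to a simple check of the GT-AG rule at an intron boundary. The following is a minimal sketch in Python, not part of the original disclosure; the function names are hypothetical, and the only data used are the boundaries printed for intron 3.

```python
def split_by_case(boundary):
    """Return (exon_part, intron_part) of a boundary string.

    Upper case letters are exon sequence, lower case letters are intron
    sequence; the dots that mark sequence not shown are ignored.
    """
    seq = boundary.replace(".", "")
    exon = "".join(ch for ch in seq if ch.isupper())
    intron = "".join(ch for ch in seq if ch.islower())
    return exon, intron


def follows_gt_ag(donor, acceptor):
    """True when the intron starts with 'gt' and ends with 'ag'."""
    _, donor_intron = split_by_case(donor)
    _, acceptor_intron = split_by_case(acceptor)
    return donor_intron.startswith("gt") and acceptor_intron.endswith("ag")


# Boundaries of intron 3 as printed in Table V.
print(follows_gt_ag("CTACAGgtgagg..", "..caccagGGCCAC"))
```

The same call can be repeated for the other rows of the table once their sequences are legible.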




Example 3

Lasp-1 (MLN 50) Encodes the First Member of a New Protein Family
Characterized by the Association of LIM and SH3 Domains

Introduction 

In Example 1 above, we describe the isolation of MLN 50 (Lasp-1)
from a breast cancer derived metastatic lymph node cDNA library by differential
hybridization using malignant (metastatic lymph node) versus nonmalignant
(fibroadenoma and normal lymph node) breast tissue. Chromosomal mapping
allowed us to map the Lasp-1 gene to the q12-q21 region of the chromosome 17
long arm. This region is known to be altered in 20 to 30% of breast cancers
leading to the amplification of the proto-oncogene c-erbB-2 (Fukushige, S.I. et
al., Mol. Cell. Biol. 6:955-958 (1986); Slamon, D.J. et al., Science 244:707-712
(1989)). In breast cancer cell lines, we found that Lasp-1 RNA overexpression
was correlated with its gene amplification and with c-erbB-2
amplification/overexpression, suggesting that Lasp-1 and c-erbB-2 belong to the
same amplicon. In the present example, we determined the frequency of Lasp-1
overexpression in human breast cancer and characterized the encoded protein.

Materials and Methods 

Tissue and Cell Cultures

Surgical specimens obtained at the Hôpitaux Universitaires de Strasbourg
were frozen in liquid nitrogen for RNA extraction. Adjacent sections were fixed
in 10% buffered formalin and paraffin embedded for histological examination.

The cell lines (SK-BR-3, BT-474, MCF-7) are available from the
American Type Culture Collection (ATCC, Rockville, MD). Cells were routinely
maintained in our laboratory and cultured at confluency in Dulbecco's modified
Eagle's medium supplemented with 10% fetal calf serum (SK-BR-3) and with 10
μg/ml of insulin (MCF-7), and in RPMI supplemented with 10% fetal calf serum
and 10 μg/ml of insulin (BT-474).

RNA Preparation and Analysis

Surgical specimens were homogenized in the guanidinium isothiocyanate
buffer and purified by centrifugation through a cesium chloride cushion
(Chirgwin, J.M. et al., Biochemistry 18:5294-5299 (1979)). RNAs from cultured
cell lines were extracted using the single-step procedure of Chomczynski, P. &
Sacchi, N., Anal. Biochem. 162:156-159 (1987). RNAs were fractionated by
electrophoresis on 1% agarose, 2.2 M formaldehyde gels (Lehrach, H. et al.,
Biochemistry 16:4743-4751 (1977)), transferred to nylon membrane (Hybond N,
Amersham Corp.) and immobilized by baking for 2 hrs at 80°C.

Probe Preparation and Hybridization

Lasp-1 probe corresponded to a 1.0 kb BamHI fragment released from
MLN 50 subcloned into pBluescript. The RNA loading control probe 36B4 was
an internal 0.7 kb PstI fragment (Masiakowski, P. et al., Nucleic Acids Res.
10:7895-7903 (1982)).

Northern blots were hybridized at 42°C in 50% formamide, 5x SSC, 0.4%
ficoll, 0.4% polyvinylpyrrolidone, 20 mM sodium phosphate pH 6.5, 0.5% SDS,
10% dextran sulfate and 100 μg/ml denatured salmon sperm DNA, for 36-48 hrs
with the 32P-labeled probe diluted to 0.5-1.10^ cpm/ml. Stringent washings were
performed at 60°C in 0.1x SSC and 0.1% SDS. Blots were autoradiographed at
-80°C for 24 hrs.

Sequence analyses were performed using the GCG sequence analysis
package (Wisconsin package version 8.0, Genetics Computer Group, Madison,
WI). The Lasp-1 cDNA and amino acid sequences were used to search the
complete combined GenBank/EMBL database and the complete SwissProt
database with BLAST (Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990))
and FastA (Pearson, W.R. & Lipman, D.J., Proc. Natl. Acad. Sci. USA 85:2444-
2448 (1988)) programs, respectively. The LIM motif and consensus sequences
of Lasp-1 were further identified by the motif program in the PROSITE dictionary
(release 12). The sequence alignments were obtained automatically by using the
program PileUp (Feng, D.F. & Doolittle, R.F., J. Mol. Evol. 25:351-360 (1987)).

Results and Discussion

To determine Lasp-1 mRNA distribution we carried out Northern blot
analysis using the cDNA as a probe. A single 4.0 kb mRNA band was detected
at low level in all the human tissue and cell lines studied (Fig. 13 and data not
shown). Lasp-1 mRNA overexpression was found in 8% (5/61) of primary breast
cancers (Fig. 13(A), lane 8) and in 40% (2/5) of breast cancer derived metastatic
lymph nodes (Fig. 13(A), lanes 1 and 2). No expression (0/15) above the basal
level was found in nonmalignant breast tissues (Fig. 13(A), lanes 13-17,
fibroadenomas; lane 18, hyperplastic breast) nor in normal adult tissues (Fig.
13(B), lanes 1-6 and data not shown). By comparison with c-erbB-2
overexpression, Lasp-1 was found to be coexpressed in most (Fig. 13(A), lanes
1, 2 and 8; Fig. 13(B), lane 8) but not in all (Fig. 13(A), lane 12; Fig. 13(B), lane
7) human breast cancers and cell lines. These results suggest that Lasp-1 is quite
ubiquitous at the RNA level, with an increased expression in some breast cancer
tissue and derived metastatic lymph nodes which is probably caused by gene
amplification centered around the c-erbB-2 locus.

The complete Lasp-1 cDNA sequence was established from four
independent cDNA clones. Both sense and antisense strands were sequenced.
The longest cDNA clone contained 3848 bp, a size consistent with the transcript
size, suggesting that this clone should correspond to the full length cDNA (Fig.
14(A)) (SEQ ID NO:3). At the nucleotide level, sequence homologies were found
with 22 expressed sequence tags (ESTs) (Weinstock et al., Curr. Opin. Biotech.
5:599-603 (1994), and refs. therein). Some of these sequences are redundant and
they were mostly located on the 3' untranslated end of the molecule (Fig. 14(B)).
Most of these ESTs were established from different human cDNA libraries
established using normal tissues (fetal brain, white blood cells, prostate gland,
liver, pancreatic islet cells and fetal spleen). The presence of Lasp-1 transcripts
in all these samples is in good agreement with our finding of ubiquitous expression
of Lasp-1 mRNA (Fig. 13 and data not shown).

The first ATG codon (nucleotide position 76 of Fig. 14(A) (SEQ ID
NO:3)) had a favorable context for initiation of translation (Kozak, M., Nucl.
Acids Res. 15:8125-8148 (1987)), and a classical AATAAA poly(A) addition
signal sequence (Wahle, E. & Keller, W., Annu. Rev. Biochem. 61:419-440
(1992)) was located 13 bp upstream of the poly(A) stretch (Fig. 14(A) (SEQ ID
NO:3)). The deduced open reading frame encoded a 261 amino acid protein, with
a molecular weight of 30 kD and a pHi of 6.5 (Fig. 14(A) (SEQ ID NO:4)). The
protein showed several consensus sequences: an amidation site (GGKR, residues
203-206 of Fig. 14(A), SEQ ID NO:4), several phosphorylation sites by cAMP
and cGMP dependent protein kinase (RRDS, residues 141-144 of Fig. 14(A),
SEQ ID NO:4), casein kinase II (SGGE, 139-136; SAAD, 213-216; SFQD, 221-
224; all of Fig. 14(A), SEQ ID NO:4), protein kinase C (TEK, 14-16; TCK, 33-
35; SYR, 150-152; all of Fig. 14(A), SEQ ID NO:4) and tyrosine kinase
(KKGYEKKPY, 38-45; KDSQDGSSY, 137-144; all of Fig. 14(A), SEQ ID
NO:4). Moreover, a cysteine rich region was identified as a LIM (Sanchez-Garcia,
I. & Rabbitts, T.H., Trends Genet. 9:315-320 (1994)) domain in the N-terminal
part and a SH3 (Musacchio et al., FEBS Lett. 307:55-61 (1992)) domain at the
C-terminal portion of the protein.
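
Consensus sites of the kind listed above are commonly located by matching short PROSITE-style patterns against the deduced protein sequence. The sketch below is an illustration in Python and is not the GCG motif program cited in the methods: the regular expressions are the usual PROSITE patterns for these sites, the function name `scan_sites` is hypothetical, and the test peptide is invented for the example rather than taken from SEQ ID NO:4.

```python
import re

# PROSITE-style patterns written as regular expressions. The lookahead
# form (?=(...)) lets overlapping sites be reported.
PATTERNS = {
    "casein kinase II": r"(?=([ST]..[DE]))",
    "protein kinase C": r"(?=([ST].[RK]))",
    "cAMP/cGMP-dependent kinase": r"(?=([RK]{2}.[ST]))",
    "amidation": r"(?=(.G[RK][RK]))",
}


def scan_sites(protein):
    """Return (site name, motif, first residue, last residue), 1-based."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in re.finditer(pattern, protein):
            motif = match.group(1)
            start = match.start() + 1
            hits.append((name, motif, start, start + len(motif) - 1))
    return hits


# Invented peptide used only to exercise the patterns.
toy_peptide = "MTEKARRDSAGGKRSFQD"
for hit in scan_sites(toy_peptide):
    print(hit)
```

Such short patterns match frequently by chance, so hits of this kind indicate potential sites only.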

The deduced primary sequence of Lasp-1 contains two likely tyrosine
phosphorylation sites (underlined in Fig. 14); these residues are followed by short
tripeptides demonstrating homology to the predicted SH2 binding motif
(Songyang et al., Cell 72:767-778 (1993)).

A Single LIM Domain is Present at the N-part of Lasp-1

The LIM domain is an arrangement of seven cysteine and histidine residues
(f^y^x-C^X^gasr^^^ present in a number of invertebrate and vertebrate proteins.
The generic name was given for the product of the three firstly identified LIM
genes lin-11, isl-1 and mec-3. The family of LIM containing proteins is
continuously increasing and could be subdivided in distinct groups
(Sanchez-Garcia, I. & Rabbitts, T.H., Trends Genet. 9:315-320 (1994)). One
group designated LIM-HD includes proteins having two LIM domains associated
with a homeodomain (lin-11, isl-1, mec-3). Another group designated LIM-only
includes proteins exhibiting a single (CRIP), two (CRP, TSF3, RBTN1, RBTN2,
RBTN3) or three (zyxin) LIM domains. Recently, a new group designated LIM-K,
including proteins having two LIM domains associated with a kinase domain, had
been described (Sanchez-Garcia, I. & Rabbitts, T.H., Trends Genet. 9:315-320
(1994); Mizuno et al., Oncogene 9:1605-1612 (1994)).
The LIM domain defines a zinc binding structure and zinc binding is required for
the proper folding of the domain.

Sequence alignments of LIM proteins with Lasp-1 showed a best score
alignment with the C. elegans YLZ4 putative protein (Accession No. P34417).
Although the overall homology is low (36% identity and 55% similarity), it is high
within the LIM domain (66% identity and 80% similarity). The protein YLZ4 was
identified in the whole sequencing of the C. elegans chromosome III (Wilson, R.
et al., Nature 368:32-38 (1994)). The LIM domain of YLZ4 does not perfectly fit
the LIM consensus: the first two cysteines are spaced by four instead of two
residues, leading to a gap in the alignment (Fig. 15(A)). Among other LIM
containing proteins, besides the LIM consensus sequence, additive homologies
were found in the human cysteine-rich protein CRP (Liebhaber et al., Nucl. Acids
Res. 18:3871-3879 (1990)), the rat cysteine-rich intestinal protein CRIP and the
physiological function of these proteins is not yet known, although a role for CRIP
in intestinal zinc absorption has been suggested and CRP was identified as a
binding partner for a LIM-only protein, zyxin. The interaction between these two
proteins, believed to have regulatory or signaling functions in focal adhesion
plaques (Crawford et al., J. Cell Biol. 116:1381-1393 (1992); Crawford et al., J.
Cell Biol. 124:117-127 (1994); Sadler et al., J. Cell Biol. 119:1573-1587 (1992)),
is mediated by sequence-specific interactions between their LIM domains
(Schmeichel & Beckerle, Cell 79:211-219 (1994)). The LIM domain can be
considered as a protein/protein modular binding interface similarly to SH2 and
SH3 domains (Schmeichel & Beckerle, Cell 79:211-219 (1994)). Our findings
showing a strong conservation for the Lasp-1 LIM domain across a wide range of
different species (mammals, nematodes and plant) suggest an important function
for this domain.

Lasp-1 Contains a SH3 Domain at the C-terminal Part

The SH3 (src homology region 3) is a small protein domain of 60 amino
acids, first identified as a conserved sequence in the N-terminal noncatalytic part
of the src protein tyrosine kinase (Sadowski et al., Mol. Cell. Biol. 6:4396-4408
(1986); Mayer et al., Nature 332:272-275 (1988)). A number of proteins
involved in the tyrosine kinase signal transduction pathway contain SH3 domains
(Schlessinger, Curr. Opin. Genet. Develop. 4:25-30 (1994)); this domain could
also be found in proteins of unrelated functions such as cytoskeleton associated
proteins (Musacchio et al., FEBS Lett. 307:55-61 (1992)). The function of the
SH3 domain remains unclear; however, SH3 containing proteins are usually
located close to the plasmic membrane, suggesting a role for this domain in the
targeting of proteins to this cellular compartment (Musacchio et al., FEBS Lett.
307:55-61 (1992)). Direct evidence of the targeting properties of the SH3 domain
of the adaptor molecule Grb2 was provided (Bar-Sagi et al., Cell 74:83-91 (1993)).
Hints to the function were achieved by the resolution of several different SH3
domains, showing that the overall structure is conserved and independently folded.
Also, several protein ligands for the SH3 domains of oncogenic tyrosine kinases
have been isolated, leading to the definition of specific proline-rich regions
required for the binding to SH3 domains (Alexandropoulos et al., Proc. Natl.
Acad. Sci. USA 92:3110-3114 (1995) and refs. therein).

Sequence alignment revealed homology of the Lasp-1 C-terminal part with
several SH3 containing proteins (Fig. 15(B)), including in the SH3 domain of
EMS1 (Schuuring et al., Oncogene 7:355-361 (1992)), a human homolog of the
src tyrosine kinase substrate cortactin (Wu et al., Mol. Cell. Biol. 11:5113-5124
(1991)). The strongest conservation was found with the YLZ3 putative protein
of C. elegans (Accession No. P34416); the overall homology is low (23% identity
and 40% similarity) but significant within the SH3 domain (57% identity and 74%
similarity). This protein was deduced from the whole C. elegans chromosome III
sequencing. Interestingly, on the F42H10.3 cosmid the gene encoding YLZ3 lies
next to the gene encoding YLZ4, which contained a LIM domain strongly
homologous with that of Lasp-1 (Fig. 15(A)). This may reflect modular evolution
processes leading to join in the same protein functional domains separated in
proteins from primitive organisms.
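
Identity and similarity percentages such as those quoted above are computed over the aligned positions of two sequences. The following Python sketch shows one way to do so under stated assumptions: the conservative-substitution groups and the choice to ignore gapped columns are assumptions made for the illustration (alignment programs normally derive similarity from a scoring matrix), and the two aligned fragments are invented, not taken from Lasp-1, YLZ3 or YLZ4.

```python
# Groups of residues treated as similar; an assumption for this sketch.
GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def similar(a, b):
    """True for identical residues or residues in the same group."""
    return a == b or any(a in group and b in group for group in GROUPS)


def identity_similarity(aligned_a, aligned_b):
    """Percent identity and similarity over columns without gaps ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no aligned residue pairs")
    identical = sum(a == b for a, b in pairs)
    alike = sum(similar(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * alike / len(pairs)


# Invented aligned fragments with one gap column.
identity, similarity = identity_similarity("CKAC-EKIVY", "CRACGDKLVF")
print(round(identity, 1), round(similarity, 1))
```

Other conventions for the denominator (for example, the length of the shorter sequence) give different percentages, so values are comparable only when computed the same way.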

In conclusion, Lasp-1 carries a LIM domain and a SH3 domain. These
domains are involved in protein/protein interactions occurring in different cellular
processes including development, transcription, transformation and cell signaling.
LIM domains have been shown to be associated with two distinct functional
domains, the homeo and kinase domains. SH3 domains are often found in
association with SH2, pleckstrin homology (PH) and kinase domains. A link
between LIM and SH3 domains was found by the interaction of the cytoskeletal
protein paxillin (LIM only protein) with SH2 and SH3 domains of vinculin and the
focal adhesion kinase (pp125FAK). To date Lasp-1 is the first protein containing
both domains and could represent the first member of a new protein family of
adaptor molecules involved in cell signaling. The ubiquitous expression of Lasp-1
in human adult tissues suggests a basic cellular function for this protein; moreover,
its overexpression through genetic amplification in 10 to 15% of human breast
cancers suggests that Lasp-1 could be implicated in carcinogenesis or tumor
progression.

Example 4

MLN 64, a Gene Co-Expressed with the c-erbB-2 Oncogene
in Malignant Cells and Tissues

Introduction

In Example 1 above, we describe isolating human MLN 64 cDNA from a
metastatic breast cancer cDNA library. This clone was identified through a
differential screening performed by using two subtractive probes, respectively
representative of metastatic and nonmalignant breast tissues, in order to identify
new genes susceptible to be specifically involved in breast cancer.

We mapped MLN 64 at the q12-q21 region of the long arm of
chromosome 17 with a maximum in the q21.1 band (see, supra, Example 1). This
region already includes two genes known to be involved in breast cancer disease,
the oncogene c-erbB-2 (Slamon, D.J. et al., Science 235:177-182 (1987)) in q12
and the tumor suppressor gene BRCA1 (Hall, J.M. et al., Science 250:1684-1689
(1990); Brown & Solomon, Curr. Opin. Genet. Dev. 4:439-445 (1994), and refs.
therein) in q21. c-erbB-2 overexpression is correlated with a shorter overall and
disease free survival for breast cancer patients (Muss, H.B. et al., N. Engl. J. Med.
330:1260-1266 (1994), and refs. therein). Moreover, c-erbB-2 overexpression
has been shown to be dependent on gene amplification during carcinogenesis (van
de Vijver, M. et al., Mol. Cell. Biol. 7:2019-2023 (1987)). We established in
Example 1 that the MLN 64 gene was co-amplified with the c-erbB-2 gene in
SKBR3 and BT474 breast cancer cell lines. It is assumed that DNA amplification
plays a crucial role in tumor progression by allowing cancer cells to upregulate
numerous genes (Lönn, U. et al., Intl. J. Cancer 58:40-45 (1994); Kallioniemi, A.
et al., Proc. Natl. Acad. Sci. USA 91:2156-2160 (1994)), and notably oncogenes.
Frequency of gene amplification as well as gene copy number increase during
breast cancer progression, notably in patients who do not respond to treatment,
suggesting that overexpression of the amplified target genes confers a selective
advantage to malignant cells (Schwab, M. & Amler, L., Genes Chrom. Cancer
1:181-193 (1990); Lönn, U. et al., Intl. J. Cancer 58:40-45 (1994); Guan, X.Y.
et al., Nat. Genet. 8:155-161 (1994)).

BRCA1 is responsible for about half of the inherited forms of breast
carcinomas, suggesting that other tumor suppressor gene(s) could be implicated
(Miki, Y. et al., Science 266:66-71 (1994)). BRCA1 has been shown to exhibit
various possible disease-causing alterations including frameshifts and nonsense
mutations (Castilla et al., Nat. Genet. 8:387-391 (1994); Friedman et al., Nat.
Genet. 8:399-404 (1994); Simard et al., Nat. Genet. 8:392-398 (1994)).

Finally, in sporadic primary breast carcinomas, various sites of DNA
mutation, deletion or amplification have been reported in the q12-q21 region of
the chromosome 17 (Kirchweger et al., Intl. J. Cancer 56:193-199 (1994); Futreal
et al., Science 266:120-122 (1994); Guan, X.Y. et al., Nat. Genet. 8:155-161
(1994)). In this context, MLN 64, which is located in the q12-q21 region of the
chromosome 17 and amplified and overexpressed in breast cancer cell lines, may
be involved in molecular processes leading to breast cancer development and/or
progression.

In the present Example, we characterized the MLN 64 cDNA, protein and
gene organization, and investigated the MLN 64 gene expression in a panel of
normal and malignant human tissues.

Materials and Methods

Tissue and Cell Line Collections

Depending on subsequent analysis, tissues were either frozen
in liquid nitrogen (RNA extraction), or fixed in formaldehyde and paraffin
embedded (in situ hybridization and immunohistology). Frozen tissues were
stored at -80°C whereas paraffin-embedded tissues were stored at 4°C.

The mean age of the 39 patients included in the present study was 55
years. The main characteristics of the breast carcinomas were as follows: SBR
grade I (13%), grade II (38%), grade III (49%); estradiol receptor positive (25%),
negative (75%); lymph nodes without invasion (39%), with invasion (61%).

RNA Isolation and Analysis 

Total RNA prepared by a single-step method using guanidinium
isothiocyanate (Chomczynski, P. & Sacchi, N., Anal. Biochem. 162:156-159
(1987)) was fractionated by agarose gel electrophoresis (1%) in the presence of
formaldehyde. After transfer, RNA was immobilized by heating (12 hrs, 80°C).
Filters (Hybond N; Amersham Corp.) were acidified (10 min, 5% CH3COOH) and
stained (10 min, 0.004% methylene blue, 0.5 M CH3COONa, pH 5.0) prior to
hybridization.

The MLN 64 probe described in Example 1, corresponding to the
full-length human cDNA (nucleotides 1-2008) cloned into pBluescript II SK- vector
(Stratagene), was 32P-labeled using random priming (~10^ cpm/ng DNA)
(Feinberg, A.P. & Vogelstein, B., Anal. Biochem. 137:266-267 (1984)). Filters
were prehybridized for 2 hrs at 42°C in 50% formamide, 5x SSC, 0.1% SDS,
0.5% PVP, 0.5% Ficoll, 50 mM sodium pyrophosphate, 1% glycine, 500 μg/ml
of ssDNA. Hybridization was for 18 hrs under stringent conditions (50%
formamide, 5x SSC, 0.1% SDS, 0.1% PVP, 0.1% Ficoll, 20 mM sodium
pyrophosphate, 10% dextran sulfate, 100 μg/ml ssDNA; 42°C). Filters were
washed 30 min in 2x SSC, 0.1% SDS at room temperature, followed by 30 min
in 0.1x SSC, 0.1% SDS at [...]. After dehybridization, filters were rehybridized
with a c-erbB-2 specific probe. The 36B4 probe (Masiakowski, P. et al., Nucleic
Acids Res. 10:7895-7903 (1982)) was used as positive internal control.
Autoradiography was for 2 days for hybridizations of MLN 64 and c-erbB-2
whereas 36B4 hybridization was exposed for 16 hrs.

Genomic DNA Isolation and Analysis

Genomic DNAs (10 μg) from human leukocytes and from monkey, pig,
rabbit, rat, hamster, mouse, chicken, fly and worm were digested with EcoRI or
TaqI, fractionated by agarose gel electrophoresis (0.8%), and transferred to nylon
membranes (Hybond N+, Amersham Corp.). The hybridization conditions for
Southern blots were identical to those previously described for Northern blots.

Preparation of Monoclonal Antibodies and Immunohistochemistry

The synthetic peptide PC94 corresponding to 16 AA (amino acid(s))
located in the C-terminal part of the putative MLN 64 protein (FIG. 16) was
synthesized in solid phase using Fmoc chemistry (Model 431A peptide synthesizer,
Applied Biosystems, Inc., Foster City, CA), verified by amino acid analysis
(Model 420A-920A-130A analyzer system; Applied Biosystems, Inc.) and coupled
to ovalbumin (Sigma Chemical Co., St. Louis, MO) through an additional
NH2-extraterminal cysteine residue, using the bifunctional reagent MBS (Aldrich
Chemical Co., Milwaukee, WI).

Two 8-week-old female BALB/c mice were injected intraperitoneally
with 100 μg of coupled antigen every two weeks until obtention of positive
antisera. Four days before the fusion, the mice received a booster injection of
antigen (100 μg), and then 10 μg by intravenous and 10 μg by intraperitoneal route
every day until spleen removal. The spleen cells were fused with Sp2/0-Ag14
myeloma cells according to St. Groth & Scheidegger, J. Immunol. Meth. 35:1-21
(1980). Culture supernatants were screened by ELISA using the unconjugated
peptide as antigen. Positive culture media were then tested by
immunocytofluorescence and Western blot analysis on MLN 64 cDNA transfected
COS-1 cells. Five hybridomas, found to secrete antibodies specifically recognizing
MLN 64, were cloned twice on soft agar. They all corresponded to the IgG1, k
subclass of immunoglobulins (Isotyping kit, Amersham Corp.).

Immunohistochemical analysis was performed as previously described
(Rio, M.C. et al., Proc. Natl. Acad. Sci. USA 84:9243-9247 (1987)) using
paraffin-embedded tissue sections. Hybridoma supernatant was diluted 2-fold and
a peroxidase-antiperoxidase system (DAKO, Carpinteria, CA) was used for the
revelation.

In Situ Hybridization

In situ hybridization was performed using a 35S-labeled antisense RNA
probe (5x10* cpm/μg) specific of the human MLN 64 cDNA. Formaldehyde-fixed
paraffin-embedded tissue sections (6 μm thick) were deparaffined in LMR,
rehydrated and digested with proteinase K (1 μg/ml; 30 min, [...]).
Hybridization was for 18 hrs, followed by RNase treatment (20 μg/ml; 30 min,
37°C) and two stringent washes (2x SSC, 50% formamide; 60°C, 2 hrs).
Autoradiography was for 2 to 4 weeks using NTB2 emulsion (Kodak). After
exposure, the slides were developed and counterstained using toluidine blue. 35S-
labeled sense transcript from MLN 64 was tested in parallel as a negative control.

MLN 64 Genomic DNA Cloning

Fifty μg of human genomic DNA was partially digested with Sau3A. After
size selection on a 10-30% sucrose gradient, inserts (16-20 kb) were subcloned
at the BamHI replacement site in lambda EMBL 301 (Lathe, R. et al., Gene
57:193-201 (1987)). 2.5x10^ recombinant clones were obtained and the library was
amplified once. One million pfu were analyzed in duplicate for the presence of
genomic MLN 64 DNA, using a 5' and a 3' end specific MLN 64 probe. The 5'
probe was obtained using an amplified DNA fragment (nucleotides 1 to 81) and the
3' probe corresponded to an EcoRI fragment encompassing MLN 64 XYZbp
(nucleotides 60 to 2073). Ten and 18 clones gave a positive signal with the 5' and
3' probe, respectively. After a second screening, 4 clones, hybridizing with the
two probes, were subcloned into pBluescript II SK- vector (Stratagene),
sequenced and positioned with respect to the MLN 64 cDNA sequence.

RT-PCR - Sequencing Reactions

MLN 64 cDNA clones and genomic subclones prepared as described
(Zhou, C. et al., Biotechniques 8:172-173 (1990)) were further purified with
RNaseA treatment (10 μg/ml; 30 min, 37°C) followed by PEG/NaCl precipitation
(0.57 vol., 20%, 2 M) and ethanol washing. Vacuum dried pellets were
resuspended at 200 ng/μl in TE. Double-stranded DNA templates were then
sequenced with Taq polymerase, using either pBluescript universal primers and/or
internal primers, and dye-labeled ddNTPs for detection on an Applied Biosystems
373A automated sequencer.

Computer Analysis

Sequence analyses were performed using the GCG sequence analysis
package (Wisconsin Package, version 8, Genetics Computer Group). The MLN
64 cDNA sequence and its deduced protein were used to search the complete
combined GenBank/EMBL databases and the complete SwissProt database
respectively, with BLAST (Altschul, S.F. et al., J. Mol. Biol. 215:403-410
(1990)) and FastA (Pearson, W.R. & Lipman, D.J., Proc. Natl. Acad. Sci. USA
85:2444-2448 (1988)) programs.

Results

Determination of Human MLN 64 cDNA and Putative Protein
Sequences

The complete MLN 64 cDNA sequence has been established from six
independent cDNAs, coming from a tissular cDNA library constructed using
human metastatic axillary lymph nodes (Example 1). For each clone, both sense
and antisense strands have been sequenced. The full-length MLN 64 cDNA
contained 2073 bp (Fig. 16) (SEQ ID NO:5). The first ATG codon (nucleotides
169-171) had the most favorable context for initiation of translation (Kozak, M.,
Nucl. Acids Res. 15:8125-8148 (1987)), and an AATTAAA poly(A) addition
signal sequence (nucleotides 2050-2056 of SEQ ID NO:5) (Wahle, E. & Keller,
W., Annu. Rev. Biochem. 61:419-440 (1992)) was located 24 bp upstream of the
poly(A) stretch. Thus, the open reading frame encodes a 445 amino acid (AA)
protein (Fig. 16) (SEQ ID NO:6), with a molecular weight of 50 kD and a pHi
of 8.2. DNA database searches reveal homology with various human expressed
sequence tags (ESTs) identified in libraries established using either adult (heart),
postnatal (brain) or embryo (placenta, liver, spleen and brain) tissues. Moreover,
75% homology was observed with the cDNA sequence (606 bp) of the clone p10.15,
recently identified through differential screening of a rat osteosarcoma cell line
cDNA library (Waye & Li, J. Cell Biochem. 54:273-280 (1994)), suggesting that
MLN 64 could correspond to the human homolog of the rat p10.15.

Surprisingly, protein alignment revealed that the homology between the
two putative proteins was restricted to the last 21 C-terminal AA of MLN 64,
which were identical to 21 AA located at the core of the p10.15 protein (Waye &
Li, J. Cell Biochem. 54:273-280 (1994)). A careful examination of both putative
proteins has been performed and showed that they result from different open
reading frames including only 21 codons in common (Waye & Li, J. Cell
Biochem. 54:273-280 (1994)). MLN 64 exhibited 29% identity and 55%
similarity with the Caenorhabditis elegans U12964 putative protein of unknown
function (Waterston R., direct submission). The putative MLN 64 protein analysis
showed potential sites (reviewed in Kemp, B.E. & Pearson, R.B., Trends
Biochem. Sci. 15:342-346 (1990)) specific of N-glycosylation (NESD, residues
219-222; mTV, residues 311-314; both of Fig. 16, SEQ ID NO:6),
phosphorylation by casein kinase II (SFFD, residues 94-97; SFPE, residues 209-
212; SDNE, residues 217-220; SDEE, residues 221-224; SAQE, residues 232-
235; SPRD, residues 343-346; TMFE, residues 426-429; all of Fig. 16, SEQ ID
NO:6), protein kinase C (SPR, residues 343-345; SAK, residues 370-372; THK,
residues 375-377; all of Fig. 16, SEQ ID NO:6), and amidation (ACXK, residues
226-229; Fig. 16, SEQ ID NO:6). Moreover, structural analysis revealed two
potential transmembrane domains (residues 1-72 and 94-168 of Fig. 16, SEQ ID
NO:6).
MLN 64 amino acid composition showed 11.5% of aromatic residues (Phe, Trp
and Tyr) and 26% of aliphatic residues (Leu, Ile, Val and Met). A careful
examination of the spacing of these aliphatic residues has been performed in order
to detect a possible concordance of them. The Leu residues are principally
distributed in the 200 N-terminal AA (37 Leu), between AA285 and AA328
(7 Leu/43 AA) and between AA406 and AA441 (7 Leu/35 AA). No consensus
leucine zipper (reviewed in Busch & Sassone-Corsi, Trends Genet. 6:36-40
(1990)) nor leucine-rich repeats (Kobe & Deisenhofer, Trends Biochem. Sci.
19:415-421 (1994)) could be drawn.
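
Composition figures of this kind (percentage of aromatic or aliphatic residues, leucine counts within a stretch) can be obtained by simple counting. The Python sketch below illustrates the arithmetic only; the residue classes follow the text, while the function names and the short test sequence are hypothetical and do not represent SEQ ID NO:6.

```python
AROMATIC = "FWY"    # Phe, Trp and Tyr
ALIPHATIC = "LIVM"  # Leu, Ile, Val and Met, as grouped in the text


def percent_of(protein, residues):
    """Percentage of positions occupied by any of the given residues."""
    count = sum(protein.count(r) for r in residues)
    return 100.0 * count / len(protein)


def leucines_between(protein, first, last):
    """Number of Leu residues from position first to last (1-based, inclusive)."""
    return protein[first - 1:last].count("L")


# Invented 12-residue sequence used only to exercise the functions.
toy_protein = "MLLFVAWGILKY"
print(percent_of(toy_protein, AROMATIC))
print(percent_of(toy_protein, ALIPHATIC))
print(leucines_between(toy_protein, 1, 4))
```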

MLN 64 Variants

The tissular cDNA library was constructed using metastatic axillary lymph
nodes coming from four distinct patients. Six independent MLN 64 cDNAs have
been derived from this library and sequenced. We observed a high degree of
variability between their sequences. Thus, we observed two substitutions, of a C
to T (nucleotide 262) and A to G (nucleotide 518), changing Leu to Phe (AA32)
and Gln to Arg (AA117), respectively (Table VI, variants A and B). Another
cDNA presented a 99 bp deletion (nucleotides 716-814) leading to the deletion
of 33 AA (AA184-AA216) and to a 412 AA putative protein (Table VI, variant
C). Finally, one clone exhibited a 51 bp insertion (between nucleotides 963-964)
generating a stop codon 48 bp downstream of the insertion site and giving rise to
a 281 AA chimeric C-terminal truncated protein containing 16 aberrant AAs at its
C-terminal part (Table VI, variant D). These results showed that at least 4
modifications occur in the MLN 64 open reading frame. Since genes exhibiting
genetic and epigenetic DNA alterations leading to protein modifications and
presumably to loss of function could play a role in transformation and/or cancer
progression (QoQXisGn et al., Amer. J. Pathol. 143:567-574 (1993); Kstaga et al.,
Cytogenet. Cell Genet. 69:39-44 (1995)) and in order to avoid the possibility that
the observed variations result from cDNA library artifacts, we decided to reclone
MLN 64 cDNAs from a second library established using the SKBR3 breast cancer
cell line (unpublished data).

Twenty-five new MLN 64 cDNAs were cloned and MLN 64 specific
primers were designed in order to identify, using PCR, the presence of
insertion/deletion variants identical to those previously isolated from the tissular
library. Among the 25 clones, 6 showed modified sizes consistent with already
identified deletion/insertion events whereas the 19 remaining clones showed a size
identical to that of the wild type MLN 64 cDNA (data not shown). Sequence
analyses of the 6 variant clones showed that they all contained a C at nucleotide
262 position and an A to G substitution at nucleotide 518 position (Table VI,
variant B), suggesting that single nucleotide variations observed in the MLN 64
clones isolated from the tissular library could correspond to individual
polymorphism since the library was established using tissues from different
patients. Four clones presented a 99 bp deletion (nucleotides 716-814), a
modification previously observed in cDNAs cloned from the metastatic library
(Table VI, variant C). In addition to the 99 bp deletion, one clone exhibited a
13 bp deletion (nucleotides 531-543) generating a frameshift and giving rise to a
247 AA chimeric C-terminal truncated protein containing the 121 N-terminal AAs
and 126 aberrant AAs at the C-terminal part (Table VI, variant F). A 657 bp
insertion (between nucleotides 963 and 964) was observed in another clone,
which [...] AA C-truncated protein (Table VI, variant E). The remaining clone
showed three modifications, a 137 bp deletion (nucleotides 115-251) leading to
the loss of the initiating ATG codon, the already described 13 bp deletion
(nucleotides 531-543) and a 199 bp insertion (downstream nucleotide 715). Since
the first potential ATG codon is located at nucleotides 1087 to 1089, this clone
could possibly encode a N-terminal truncated protein containing the 138
C-terminal AA of the MLN 64 (Table VI, variant G). Thus, in addition to the
variants previously observed in the tissular cDNA library, we observed 3 novel
MLN 64 variants in the cellular cDNA library. All studied clones presented a
polyA+, excluding the possibility that insertions could correspond to unspliced
pre-messenger RNAs. The identification of 2 identical variants (Table VI,
variants B and C) isolated from the 2 distinct libraries showed that they are not
due to cDNA library artefacts but to cDNA modifications specific of the MLN 64
gene. The putative nonsense protein sequences present in variants D, E and F
showed no homology with already known protein sequences contained in
databases.
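
The consequences of the insertions and deletions listed above follow from simple reading-frame arithmetic, sketched below. This is an illustrative calculation only. The function names are hypothetical, and `ORF_START = 169` is an assumption: the position of the initiating ATG is not stated in this passage, and the value was chosen only because it reproduces the reported codon positions (AA32 for nucleotide 262, AA117 for nucleotide 518, and 265 native residues upstream of the insertion between nucleotides 963 and 964).

```python
WILD_TYPE_AA = 445  # length of the MLN 64 protein given in the text


def frame_effect(length_bp):
    """Classify an insertion or deletion by its length in base pairs."""
    codons, remainder = divmod(length_bp, 3)
    if remainder == 0:
        return ("in-frame", codons)
    return ("frameshift", None)


# Event lengths (bp) quoted in the text for the MLN 64 variants.
events = {
    "99 bp deletion": 99,
    "13 bp deletion": 13,
    "137 bp deletion": 137,
    "51 bp insertion": 51,
    "199 bp insertion": 199,
    "657 bp insertion": 657,
}
effects = {name: frame_effect(bp) for name, bp in events.items()}

# Variant C: an in-frame loss of 33 codons from a 445 AA protein.
variant_c_length = WILD_TYPE_AA - effects["99 bp deletion"][1]

# ASSUMPTION: first nucleotide of the initiating ATG (not given in the text).
ORF_START = 169


def codon_number(nucleotide, orf_start=ORF_START):
    """1-based codon (amino acid) number containing a cDNA nucleotide."""
    return (nucleotide - orf_start) // 3 + 1


for name, effect in effects.items():
    print(name, effect)
print("variant C length:", variant_c_length)
print("nt 262 -> AA", codon_number(262), "| nt 518 -> AA", codon_number(518))
```

Note that an in-frame insertion can still truncate the protein when the inserted sequence carries a stop codon, as reported for the 51 bp insertion (stop codon 48 bp downstream, hence 16 aberrant residues).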

In order to determine if these variants were specific of malignancy and
since MLN 64 was expressed in placenta (see, infra), we used a human cDNA
placenta library (J.M. Garnier, unpublished data) to search for variants using the
same PCR protocol as for the previously described SKBR3 library screening.
Nine independent clones have been identified and checked for alternative splicing
events. The incidence of variants was lower than in transformed tissues since only
one variant corresponding to the insertion of 199 bp, already identified in
malignant tissue, was found.

MLN 64 Gene Organization

A human leukocyte genomic library was screened using two probes
corresponding to nucleotides 1-81 (Fig. 16; SEQ ID NO:5) obtained by PCR
amplification and to an almost full-length MLN 64 cDNA (nucleotides 60-2073),
respectively (see Materials and Methods). One hundred and six clones were
hybridized, yielding a positive signal with one of the two probes.
No clones showed simultaneous hybridization with both probes. Four clones
hybridized with the smallest probe. They all contained a 6 kb insert which was
sequenced using internal primers in order to determine the exon/intron boundaries.
Four other clones hybridized to the longest probe. BamHI digestion of the inserts
gave two fragments (3.5 and 6 kb) which were subcloned and sequenced using
various primers in order to map splicing sites. The sizes of the introns were
estimated by sequencing or PCR amplification of genomic subclones using primers
located within the cDNA and at exon boundaries. The human MLN 64 gene,
whose total length was approximately 20 kb, was found to be split into 15 exons
(Fig. 17 and Table VII (exon/intron Nos. 1-14 corresponding to SEQ ID NOS:58-
71)). Exon 1 and part of exons 2 and 15 contain 5' and 3' untranslated regions of
the MLN 64 gene. Translated cDNA sequence starts at nucleotide 55 of exon 2.
Intron/exon boundaries analysis showed that the 5' splice donor sequences related
to exons 2 (SEQ ID NO:59), 3 (SEQ ID NO:60), 4 (SEQ ID NO:61), 6 (SEQ ID
NO:63), 9 (SEQ ID NO:66) and 13 (SEQ ID NO:70), and the 3' splice acceptor
sequences related to exons 2 (SEQ ID NO:59), 3 (SEQ ID NO:60), 6 (SEQ ID
NO:63), 11 (SEQ ID NO:68) and 12 (SEQ ID NO:69) did not correspond to the
canonical splice consensus sequence (Breathnach, R. & Chambon, P., Annu. Rev.
Biochem. 50:349-383 (1981)) (Table VII).
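
A boundary comparison of this kind can be sketched as follows. This is a deliberate simplification: it tests only the GT and AG dinucleotides that begin and end most introns, whereas the comparison reported above was made against the longer consensus of Breathnach and Chambon. The function name and the three intron sequences are hypothetical and are not taken from Table VII.

```python
def is_canonical_intron(intron):
    """True if an intron sequence starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical intron sequences for illustration only (not from Table VII).
examples = {
    "canonical": "GTAAGTCTTTCCTTTCAG",
    "variant donor": "GCAAGTCTTTCCTTTCAG",
    "variant acceptor": "GTAAGTCTTTCCTTTCTG",
}
report = {name: is_canonical_intron(seq) for name, seq in examples.items()}

for name, ok in report.items():
    print(f"{name}: {'canonical' if ok else 'non-canonical'}")
```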

The cDNA modifications leading to the protein variants were all
distributed from exon 2 to intron 9. Single nucleotide substitutions were observed
in exon 2 and 4 (Fig. 17, a and c). The 137 bp and 13 bp deletions occurred at the
5' end of the exon 2 (Fig. 17, b) and at the 3' end of the exon 4 (Fig. 17, d),
respectively. The 99 bp deletion concerned the entire exon 7 (Fig. 17, f). The 199
bp insertion corresponded to the 5' end of the intron 6 (Fig. 17, e), and the 51 bp
or 657 bp insertions to the 5' end or to the entire intron 9 (Fig. 17, g and h). Thus,
the deletion/insertion events occurred at the boundaries of intron 1/exon 2 (SEQ
ID NO:58/SEQ ID NO:59), exon 4/intron 4 (SEQ ID NO:61), exon 6/intron 6
(SEQ ID NO:63), intron 6/exon 7 (SEQ ID NO:63/SEQ ID NO:64) and exon
9/intron 9 (SEQ ID NO:66), presumably due to the low degree of conservation
of these splicing sites (Table VII).

Moreover, we looked for the conservation of the MLN 64 gene, using a
zooblot containing either EcoRI or BamHI digested genomic DNAs from worms,
fly, hamster, mouse, rat, pig and human. MLN 64 cDNA hybridization gave faint
and strong signals with invertebrates and vertebrates, respectively (data not
shown), indicating that MLN 64 is well conserved throughout evolution and
suggesting an important function for this protein.

MLN 64 is Overexpressed in Human Malignant Tissues 

Northern blot hybridization with the MLN 64 cDNA probe (see Materials
and Methods) gave a positive signal corresponding to MLN 64 transcripts with
an apparent molecular weight of 2 kb (Fig. 18, lanes 11, 12, 17, 18 and data not
shown). Moreover, a longer transcript of 3 kb was also detected in samples which
contain the higher amount of the 2 kb transcripts (Fig. 18, lanes 7, 17, 18 and data
not shown). After longer autoradiography, two additional species of mRNA
became visible. Polyadenylated RNA extracted from the BT474 cell line showed
an identical pattern of hybridization (data not shown).

Using Northern blot analysis, MLN 64 overexpression was observed in
malignant tumors of breast (14/93 cases), brain (2/3 cases), lung (2/23 cases)
whereas colon (4 cases), intestine (1 case), skin (5 cases), thyroid (2 cases) and
head and neck (25 cases) were negative (Fig. 18, lanes 7, 11, 12, and data not
shown). Moreover, metastatic lymph nodes derived from breast (2/6 cases), liver
(14 cases) and head and neck (1/16 cases) cancers expressed MLN 64, whereas
those from skin (7 cases), lymphoma (3 cases) and kidney (1 case) cancers were
MLN 64 negative (Fig. 18, lanes 17, 18, and data not shown). Three liver
metastases derived from breast cancer (1/1 case) and colon cancer (2/7 cases) also
expressed the MLN 64 whereas one skin and one epiploon metastases derived
from breast and ovary cancer, respectively, did not (data not shown). No MLN
64 transcripts were observed in normal human breast, axillary lymph node,
stomach, colon, liver and kidney, whereas faint signal was observed in skin, lung,
head and neck epidermoid tissues and placenta (Fig. 18, lanes 15 and 16 and data
not shown). Moreover, the breast fibroadenomas (13 cases studied), which are
benign tumors, did not show MLN 64 expression above the basal level (Fig. 18,
lanes 1-^). Altogether, these results showed that MLN 64 could be overexpressed
in the primary tumors or metastases of a wide panel of tissues including breast,
colon, liver, lung, brain and head and neck. Nevertheless, the level of MLN 64
overexpression observed in carcinomas of breast origin was 3-5 fold higher than
in cancer of other tissues.
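
For comparison across tissues, the case counts above can be converted to percentages, as in the short sketch below. Only counts that are legible as fractions in the text are included; the dictionary labels and the function name are hypothetical, and no statistical significance is implied given the small sample sizes.

```python
# (positive cases, cases studied) as quoted in the text.
counts = {
    "breast, primary tumors": (14, 93),
    "brain, primary tumors": (2, 3),
    "lung, primary tumors": (2, 23),
    "breast, lymph node metastases": (2, 6),
}


def percent(positive, total):
    """Share of positive cases, in percent, rounded to one decimal place."""
    return round(100.0 * positive / total, 1)


frequencies = {tissue: percent(*pair) for tissue, pair in counts.items()}

for tissue, value in frequencies.items():
    print(f"{tissue}: {value}%")
```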

Since in breast cancer cell lines, the MLN 64 overexpression was always
correlated with that of the erbB-2 oncogene, successive hybridizations of the
same filters with a c-erbB-2 cDNA probe have been performed. In all MLN 64
positive malignant tissues, we observed an overexpression of the erbB-2 oncogene
(Fig. 18, lanes 6, 10, 11, 16 and 17, and data not shown). Thus, as in cell lines,
the two genes were co-expressed in vivo.

MLN 64 Expression is Restricted to Malignant Cells

In situ hybridization, using an antisense MLN 64 RNA probe, was
performed on primary breast carcinomas and axillary lymph node metastases.
MLN 64 was expressed in the malignant epithelial cells, in in situ (Fig. 19) and
invasive (Fig. 19) carcinomas, whereas tumor stromal cells were negative. MLN
64 transcripts were homogeneously distributed among the positive areas. Normal
epithelial cells did not express the MLN 64 gene, even when located at the
proximity of invasive carcinomatous areas (Fig. 19 and data not shown). A similar
pattern of MLN 64 gene expression was observed in metastatic axillary lymph
nodes from breast cancer patients with expression limited to cancer cells whereas
noninvolved lymph node areas were negative (Fig. 19 and data not shown).

Using monoclonal antibody directed against a MLN 64 synthetic peptide
(see Materials and Methods), breast carcinoma immunohistochemical analysis
showed MLN 64 staining restricted to the transformed cells. Moreover,
the MLN 64 protein showed a particular distribution with cytoplasmic
condensation sites, suggesting an organelle localization for MLN 64 (Fig. 20).
An identical pattern was observed using the BT474 breast cancer cell line (Fig. 20).

Discussion

In the present Example, we characterized the MLN 64 cDNA and its
corresponding protein. In Example 1 above, MLN 64 cDNA was identified by
differential screening of a breast cancer metastatic lymph node cDNA library. The
MLN 64 protein, which contains 445 AA, showed two potential transmembrane
domains and several potential leucine zipper and leucine-rich repeat structures
previously identified in a number of diverse proteins involved in protein-protein
interaction and signal transduction (Busch & Sassone-Corsi, Trends Genet. 6:36-40
(1990); Kobe & Deisenhofer, Trends Biochem. Sci. 19:415-421 (1994)).
Although the MLN 64 cDNA presented a high degree of homology with the
plO. IS cDNA, no homology was observed between the two predicted proteins
with the exception of 21 AA (Waye & Li, J. Cell. Biochem. J-/:273-280 (1994)).
The highest degree of homology was for the Caenorhabditis elegans U12964
putative protein of unknown function.

The MLN 64 gene contains 15 exons and the coding region extends from
the 3' end of the exon 2 to the 5' end of the exon 15. In Example 1 above, we
observed that no obvious rearrangements, insertions or deletions affected the
MLN 64 gene in a panel of breast cancer cell lines. In these cell lines, the MLN
64 gene expression was always correlated with MLN 64 gene amplification.

In the present Example, in breast cancer cells and/or tissues, we identified
and characterized 7 distinct MLN 64 cDNAs, resulting from nucleotide
substitutions, deletions and/or insertions. Interestingly, the cDNA modifications
principally occurred at exon/intron boundaries, suggesting that the MLN 64
variants result from defective splicing processes. Consistently, almost all the
concerned splicing site sequences were defective (Breathnach, R. & Chambon, P.,
Annu. Rev. Biochem. 50:349-383 (1981)).

Two variants lead to AA substitution and 5 variants encode N- or C-
truncated MLN 64 proteins. In addition, 3 of them lead to chimeric proteins
containing additive nonsense protein sequences of 16, 20 and 126 AA,
respectively. Using RT-PCR, MLN 64 mRNA containing the intron 6 sequence
has been detected in placenta, showing that, at least in this case, MLN 64
alternative splicing was not a transformation specific event. It remains to be seen,
using antibodies directed against appropriate epitopes, if all MLN 64 variant
RNAs are effectively translated, specifically in cancerous tissues and/or naturally
occurring. In both physiological and/or pathological conditions, alternative
splicing has been reported to occur in transcription of a panel of genes including
those coding for the oestradiol receptor (Miksicek, Semin. Cancer Biol. 5:369-
379 (1994) and refs. therein), the ubiquitous cell surface glycoprotein CD44 (Arch
et al., Science 257:682-685 (1992); Joensuu et al., Amer. J. Pathol. 143:867-874
(1993)), the metalloprotease/disintegrin-like protein MDC (Katagiri et al.,
Cytogenet. Cell Genet. 69:39-44 (1995)) and the tumor suppressor p53 (Han &
Kulesz-Martin, Nucl. Acids Res. 20:179-181 (1992)). Although the biological
significance of these variants was not always well established, their presence in
transformed tissues is usually associated with a poor prognosis and a high
metastatic potentiality (Miksicek, Semin. Cancer Biol. 5:369-379 (1994)).

Using Northern blots, we observed two major messenger sizes at 2 kb,
consistent with the wild type mRNA, and at 3 kb, only observed in the tissues
highly expressing the 2 kb mRNA. Human normal skin, lung, head and neck and
placenta expressed MLN 64 at a low level, whereas breast, lymph nodes, stomach,
colon, liver, kidney and breast fibroadenomas did not. Interestingly, skin, lung and
head and neck are all epidermoid tissues, suggesting that MLN 64 protein could
play a physiological role in tissues of this origin. MLN 64 was overexpressed in
breast, colon, brain, liver, lung, and head and neck primary malignant tumors
and/or metastases, the highest level of expression being observed in breast
malignant tissues. Thus, MLN 64, which is observed in a wide panel of
transformed tissues, should be involved in basic processes occurring in
carcinogenesis and/or tumoral progression.

In both breast primary tumor and metastasis, MLN 64 transcripts were
homogeneously distributed throughout the carcinomatous areas, whereas normal
tissues were negative. Moreover, MLN 64 is expressed in in situ tumors,
suggesting that it may be involved in precocious events leading to tumor invasion.
Monoclonal antibody, directed against a C-terminally located MLN 64 synthetic
peptide, permitted us to localize the MLN 64 protein in vesicle-like structures in
the cytoplasm of the malignant epithelial cells. Using Western blot, MLN 64 was
found in both BT474 cell and culture medium extracts. Thus, despite the absence
of a hydrophobic secretion signal at the N-terminal part of the molecule, the MLN
64 is probably translocated across the endoplasmic reticulum membrane via a
nonclassical mechanism. The MLN 64 positive bundles also contain F-actin,
suggesting that MLN 64 is related to the cytoskeleton of the transformed cells,
possibly to podosomes. Podosomes are close contact cell-adhesive structures
regarded as a key structure in invasive processes.

We showed in Example 1 that, in breast cancer cell lines, MLN 64
overexpression is correlated with MLN 64 gene amplification and with oncogene
erbB-2 amplification, suggesting that both genes, which are co-localized in q12-
q21 on the long arm of the chromosome 17, belong to the same amplicon.
Consistently, we have now observed, in vivo, a coexpression of the two genes.
erbB-2 amplification is one of the most common genetic alterations occurring in
breast carcinomas (reviewed in Devilee & Cornelisse, Biochim. Biophys. Acta
7/5:113-130 (1994) and refs. therein) and is associated with a poor prognosis
(Slamon, D.J. et al., Science 244:707-712 (1989); Muss, H.B. et al., N. Engl. J.
Med. 330:1260-1266 (1994)). It is currently admitted that gene
amplification/overexpression confers a preferential growth to the cells and
concerns the oncogenes (Schwab, M. & Amler, L., Genes Chrom. Cancer
1:181-193 (1990); Kallioniemi, A. et al., Proc. Natl. Acad. Sci. USA 91:2156-
2160 (1994)), whereas the variants resulting in dramatic modification of the
protein permit a growth of the cells by inactivation of proteins including tumor
suppressor genes (Kulesz-Martin et al., Mol. Cell. Biol. 14:1698-1708 (1994);
Katagiri et al., Cytogenet. Cell Genet. 69:39-44 (1995)). In this context, it may
be paradoxical that the MLN 64 gene which is amplified showed numerous variant
species. What could be the efficiency of amplification if the product of the target
amplified gene is defective? Whatever the mechanism(s), since genes showing
amplification leading to overexpression or alternative splicing leading to defective
proteins (Miksicek, Semin. Cancer Biol. 5:369-379 (1994)) are most often
strongly related to cancerous processes, our results suggest that MLN 64 may
participate in carcinogenesis and/or tumor progression. Since it has recently been
proposed that the oncogenic properties of erbB-2 could be increased by the
overexpression of downstream signaling molecules possibly co-localized on the
chromosome 17, such as GRB7, it is tempting to speculate that MLN 64 could be
involved in the erbB-2 signaling pathway.

Examples

Definition of the D52 Gene/Protein Family through Cloning
of a D52 Homolog, D53

Introduction

The human D52 (hD52) cDNA was cloned through differential screening
of a breast carcinoma cDNA library (Byrne, J.A. et al., Cancer Res. 55:2896-2903
(1995)). The hD52 gene is overexpressed in approximately 40% of human breast
carcinomas, where it is specifically expressed in the cancer cells. The hD52 locus
has been mapped to chromosome 8q21, a region which is frequently amplified in
breast carcinoma (Kallioniemi, A. et al., Proc. Natl. Acad. Sci. USA 91:2156-2160
(1994); Muleris, M. et al., Genes Chrom. Cancer 10:160-170 (1994)), in cancers
of the prostate (Cher, M.L. et al., Genes Chrom. Cancer 11:153-162 (1995)) and
bladder (Kallioniemi, A. et al., Genes Chrom. Cancer 12:213-219 (1995)), and
in osteosarcoma (Tarkkanen, M. et al., Cancer Res. 55:1334-1338 (1995)).
Accordingly, we noted hD52 gene amplification in the breast carcinoma cell line
SK-BR-3 (Byrne, J.A. et al., Cancer Res. 55:2896-2903 (1995)), which has been
previously reported to harbor a chromosome 8q21 amplification (Kallioniemi, A.
et al., Proc. Natl. Acad. Sci. USA 91:2156-2160 (1994)). The predicted hD52
amino acid sequence is highly novel, possessing very little homology with
sequences thus far reported (Byrne, J.A. et al., Cancer Res. 55:2896-2903
(1995)). Using the differential display technique (Liang, P. & Pardee, A.B.,
Science 257:967-971 (1992)), a hD52 cDNA (known as N8) was also recently
cloned through its differential expression between normal and tumorous lung-
derived cell lines.

Comparing the hD52 protein sequence with translated nucleotide
sequences in genetic databases identified several expressed sequence tag (EST)
sequences which, when translated, showed 48 to 67% identity with 24 to 40 amino
acid regions of the hD52 sequence. These sequences derived from human cDNA
clones isolated from adult liver and fetal liver/spleen cDNA libraries by the
Washington University-Merck EST project. Two such cDNA clones were
provided by the IMAGE consortium at the Lawrence Livermore National
Laboratory (Livermore, California), and the insert of one was used to screen a
breast carcinoma cDNA library. This allowed us to isolate a 1347 bp cDNA
whose coding sequence predicts a 204 amino acid protein which is 52% identical
to hD52. On the basis of this homology and similarities existing between putative
domains in the 2 proteins, we have called this novel gene D53, and propose that
this represents a second member of the D52 gene/protein family.
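
Percent identity figures such as those quoted above are computed over an alignment of the two sequences. The sketch below shows one common convention, in which identical residues are counted over aligned columns that contain no gap; other conventions use a different denominator, and the text does not say which one the alignment programs applied. The function name and the two aligned fragments are hypothetical and are not hD52 or D53 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over aligned columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are skipped under this convention
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical pre-aligned fragments for illustration only.
seq_a = "ACDEF-GHI"
seq_b = "ACQEFYGKI"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity")
```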

Materials and Methods 

cDNA Library Screening

Two cDNAs (clones 83289 and 116783, corresponding to GenBank
Accession Nos. T68402 and T89899, respectively) were gifts from the IMAGE
consortium at the Lawrence Livermore National Laboratory (Livermore,
California). The random-primed 32P-labeled insert of clone 116783 was used to
screen 500,000 plaque forming units (pfus) from a breast carcinoma cDNA library
(Byrne, J.A. et al., Cancer Res. 55:2896-2903 (1995)) which had been transferred
to duplicate nylon filters (Hybond N, Amersham Corp.). Screening was
performed basically as previously described (Basset, P. et al., Nature 348:699-704
(1990)), with identified λZAP II clones being replated at densities allowing the
isolation of pure plaques, and submitted to secondary screening. Clone inserts
were rescued in the form of pBluescript SK- plasmids using the in vivo excision
system, according to the manufacturer's instructions (Stratagene).

For the isolation of mD52 cDNAs, a cDNA library was used which was
constructed by C. Tomasetto (IGBMC, Illkirch, France) using polyA+ RNA
isolated from apoptotic mouse mammary gland. OligodT-primed cDNAs were
ligated with the ZAP-cDNA linker-adaptor, and cloned into the Uni-ZAP™ XR
vector according to the manufacturer's protocol (Stratagene). A total of 850,000
pfus were screened using an EcoRI restriction fragment from the hD52 cDNA
(containing 91 bp of 5'-UTR and 491 bp of coding sequence (Byrne, J.A. et al.,
Cancer Res. 55:2896-2903 (1995))) at reduced stringency, with final filter washes
being performed using 2x SSC and 0.1% SDS at room temperature for 30 min.
A single clone (F1) was identified. After purification and insert rescue using in
vivo excision, the 32P-labeled F1 insert was used to rescreen the same cDNA library
filters using the same conditions, in order to identify a full-length cDNA (clone
C1).

DNA Sequencing 

Mini-preparations of plasmid DNA which had been further purified by
NaCl and polyethylene glycol 6000 precipitation were sequenced with Taq
polymerase and either T3 and/or T7 universal primers, or internal primers, and
dye-labeled ddNTPs for detection on an Applied Biosystems 373A automated
sequencer.

Sequence Analyses

Nucleic acid and amino acid sequence analyses were performed using the
following programs available in the GCG sequence analysis package: BLAST and
FastA, for sequence homology searches; gap, for further sequence alignments;
Isoelectric, for the calculation of pI values; Motifs, for the identification of
recognized protein motifs; and Pepcoil, for the identification of coiled-coil
domains, according to the algorithm of Lupas, A. et al., Science 252:1162-1164
(1991). PEST sequences were assigned using the PEST-FIND algorithm (Rogers,
S. et al., Science 234:364-368 (1986)), which was a gift from Dr. Martin
Rechsteiner, University of Utah, USA. Other predictions of secondary structure
were performed using the MSEQ (Black, S.D. & Glorioso, J.C., BioTech. 4:448-
460 (1986)), PHD (Rost, B. & Sander, C., Proteins 19:55-72 (1994)) and PSA
(Stultz, C.M. et al., Prot. Sci. 2:305-314 (1993)) software.

Chromosomal Localization 

Chromosomal localization of the hD53 gene was performed using
chromosome preparations obtained from phytohemagglutinin stimulated
lymphocytes. Cells were cultured for 72 hrs, with 60 µg/ml 5-bromodeoxyuridine
having been added during the final 7 hrs of culture to ensure a posthybridization
chromosomal banding of good quality. For the mD52 gene, in situ hybridization
experiments were carried out using metaphase spreads from a WMP strain male
mouse, in which all autosomes except 19 were in the form of metacentric
Robertsonian translocations. The 116783 (hD53) clone containing an insert of
842 bp in a modified pT7T3D plasmid vector (Pharmacia), and the C1 (mD52)
clone containing an insert of 2051 bp in pBluescript SK- (Stratagene), were
labeled using nick-translation to final specific activities of 8x10^ dpm/µg, and
hybridized to metaphase spreads at final concentrations of 200 ng/ml (116783) and
100 ng/ml (C1) of hybridization solution as described (Mattei, M.G. et al., Human
Genet. 69:268-271 (1985)). Autoradiography was performed using NTB2
emulsion (Kodak) for 21 days (116783) and 20 days (C1) at 4°C. To avoid any
slippage of silver grains during the banding procedure, chromosome spreads were
first stained with buffered Giemsa solution and the metaphases were
photographed. R-banding was performed using the fluorochrome-photolysis-
Giemsa method and metaphases were rephotographed before analysis.

Cell Culture

BT-20, BT-474 and MCF7 breast carcinoma cell lines, and the leukemic
cell lines HL-60 and K-562 are as described in the American Type Culture
Collection catalogue (7th ed.). Cell culture media were for BT-20, MEM
supplemented with 10% fetal calf serum (FCS), 2 mM pyruvate, 2 mM glutamine,
10 µg/ml insulin and 1% nonessential amino acids; for BT-474, RPMI 1640
supplemented with 10% FCS, 2 mM glutamine, and 10 µg/ml insulin; for MCF7,
DMEM supplemented with 10% FCS, and 0.6 µg/ml insulin; for HL-60, RPMI
1640 supplemented with 10% FCS; and for K-562, RPMI 1640 supplemented
with 10% heat-inactivated FCS and 2 mM glutamine. All cells were cultured in
the presence of antibiotics (0.1 mg/ml streptomycin, 500 U/ml penicillin and
40 µg/ml gentamycin) at 37°C with 5% CO2/95% air in a humidified incubator.
For experiments in which breast carcinoma cell lines were cultured in
estradiol supplemented or depleted media, cells were seeded into four 75 cm²
flasks at low density. These were cultured for one day before normal growth
media were replaced (3 flasks) or not (one flask) by phenol red-free DMEM
supplemented with 0.6 µg/ml insulin and 10% FCS which had been treated with
dextran-coated charcoal to deplete endogenous steroids. Cells were cultured for
2 days in steroid-depleted media before this was supplemented (2 flasks), or not
(one flask), with lO"* M or lO"* M estradiol. Cell culture was continued for 3
days, at which point cells were approaching confluence.

For experiments in which HL-60 and K-562 cells were induced to
differentiate using 12-O-tetradecanoylphorbol-13-acetate (TPA), cells were
diluted to a density of 2x10' cells/ml and 10 ml volumes were seeded into 85 mm
diameter culture dishes. At the start of each experiment, one culture dish was
immediately harvested for RNA extraction. Media were then supplemented, or
not, with 16 nM or 160 nM TPA, and cells were cultured for periods of up to
48 hrs before harvest for RNA extraction.

RNA Extraction and Northern Blot Analyses

Human surgical specimens were obtained from the Hôpitaux Universitaires
de Strasbourg, being frozen and stored in liquid nitrogen. Total RNA was isolated
from tissues and cultured cells as previously described (Rasmussen, U.B. et al.,
Cancer Res. 53:4096-4101 (1993)). Northern analyses were performed with 10
µg of total RNA which were electrophoresed through 1.0% denaturing agarose
gels and transferred to nylon filters (Hybond N, Amersham Corp.) using 20x SSC.

Northern hybridizations were performed using 32P-labeled inserts from the
116783 hD53 cDNA and the hD52 cDNA (Byrne, J.A. et al., Cancer Res.
55:2896-2903 (1995)). To verify the effectiveness of estrogen treatments in
breast carcinoma cell lines, and of TPA treatments in leukemic cell lines, we also
rehybridized filters with 32P-labeled cDNA inserts corresponding to the estrogen-
inducible gene pS2 (Rio, M.C. et al., Proc. Natl. Acad. Sci. USA 84:9243-9247
(1987)), and the transferrin receptor gene (Kühn, L.C. et al., Cell 37:95-103
(1984)), in these respective cases. All filters were rehybridized with a 32P-labeled
internal PstI fragment of the 36B4 cDNA (Masiakowski, P. et al., Nucl. Acids
Res. 10:7895-7903 (1982)), representing a ubiquitously expressed gene.
Hybridizations and washing steps were performed essentially as described (Basset,
P. et al., Nature 348:699-704 (1990)).

Results 

Isolation and Sequencing of the Human D53 cDNA

The existence of a hD52 homolog was originally predicted from 3 EST
sequences (GenBank Accession Nos. T68402, T89899 and T93647) which, when
translated, showed 24-40 amino acid regions which were 48-67% identical with
regions between amino acids 130-180 of hD52. These ESTs derived from human
cDNA clones isolated from adult liver and fetal liver/spleen cDNA libraries by the
Washington University-Merck EST project, and 2 of these cDNA clones (clones
83289 and 116783, corresponding to GenBank Accession Nos. T68402 and
T89899, respectively) were kindly provided by the IMAGE consortium at the
Lawrence Livermore National Laboratory. Sequencing of clones 83289 and
116783 in both directions indicated that they consist of 1626 bp and 842 bp,
respectively (Fig. 24(A)). Within their regions of overlap (714 bp), their
sequences were identical, except for a deletion of 100 bp in clone 83289
(corresponding to nucleotides 567-666, Fig. 24(B)), and a single T/G
polymorphism at nucleotides 254 and 371 of clones 83289 and 116783,
respectively (nucleotide 865, Fig. 24(B)).

Clones 83289 and 116783 were found to possess open reading frames
extending from their 5'-ends, encoding 60 and 99 amino acids, respectively, and
terminating with the same stop codon (Fig. 24(A)). However, because of the
sequence deletion present in the 83289 clone, the first 18 amino acids of the
83289 amino acid sequence are frame-shifted with respect to those encoded by the
corresponding DNA sequence of the 116783 clone. Thus, the first methionine
residue present in the 116783 amino acid sequence (Met*", Fig. 24(B), which is
present in a moderately favorable context for translation initiation) is no longer in-
frame in the 83289 amino acid sequence. For this reason, and also because the
lengths of these apparently partial length cDNA clones did not correspond with
the observed transcript size of 1.5 kb (see, infra), a breast carcinoma cDNA
library was screened with the 116783 clone insert in order to isolate additional
clones. The shorter 116783 clone was chosen for screening, because of the
presence of an Alu sequence in the extended 83289 3'-UTR (Fig. 24(A)).

Of the 14 positive clones thus identified, 11 remained positive upon
secondary screening, and of these, 2 (U1 and S1) possessed additional sequences
at their 5' ends with respect to the 116783 sequence. The insert of the longest
clone, U1, was sequenced in both directions. This indicated that the U1 clone
possessed 494 additional bp with respect to the 5' extent of clone 116783, and that
this sequence included a strong Kozak consensus sequence (nucleotides 175-184;
Fig. 24(B); SEQ ID NO:9). Thus the U1 sequence was noted to consist of 180 bp
of 5'-UTR, a coding sequence of 615 bp and a 3'-UTR of 552 bp, including a 22
bp polyA sequence. The hD52 and U1 coding sequences were found to be well
conserved (62% identical) over much of their lengths, but the predicted 5'-UTRs
were poorly conserved. It should be noted that, as for hD52 (Byrne, J.A. et al.,
Cancer Res. 55:2896-2903 (1995)), there is no in-frame stop codon present in the
U1 5'-UTR sequence. However, if the reading frame is continued in a 5' direction
from the proposed hD52 and U1 translation initiation sites, the resulting protein
sequences encoded show no homology to each other. This contrasts with the
protein sequences encoded after the proposed initiation of translation sites (see,
infra), where 60% identity/78% conservation of homology is observed between
the first 170 amino acids of hD52 and the corresponding region of U1. We thus
decided to term the novel gene corresponding to the U1 cDNA D53, which is
predicted to encode a protein of 204 amino acids (Fig. 24(B); SEQ ID NO:10)
having a molecular mass of 22.5 kD.

Isolation and Sequencing of a Mouse D52 cDNA

In order to further define the D52 family and the degree to which these
sequences may be conserved during evolution, a mouse homolog of the hD52
cDNA was cloned from an apoptotic mouse mammary gland cDNA library. The
identity of the initially isolated 735 bp murine F1 cDNA (Fig. 25(A)) as a D52
homolog was shown by a high level of homology noted between its incomplete
coding sequence and that of hD52 (Byrne, J.A. et al., Cancer Res. 55:2896-2903
(1995)). Of four longer cDNAs subsequently identified using the F1 cDNA, the
longest (C1, 2051 bp; Fig. 25(B); SEQ ID NO:11) appeared to contain a full-
length, 558 bp coding sequence when compared with that of hD52. The predicted
hD52 and mD52 coding sequences are 82% identical, with the latter encoding a
protein of 185 amino acids (Fig. 25(B); SEQ ID NO:12). The remaining 1482 bp
of the C1 cDNA represents 3'-UTR sequence, which is approximately 69%
identical to the corresponding region of the hD52 3'-UTR (Byrne, J.A. et al.,
Cancer Res. 55:2896-2903 (1995)). This homology ends at the polyadenylation
signal, whose sequence and position is conserved in the hD52 sequence, and
where its use gives rise to a minor 2.2 kb hD52 transcript (Byrne, J.A. et al.,
Cancer Res. 55:2896-2903 (1995)). The C1 cDNA thus appears to represent a
mouse homolog to this minor hD52 transcript, its structure having apparently been
conserved between hD52 and mD52 genes.
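
The lengths quoted above for the D53 and mD52 cDNAs can be cross-checked with
simple arithmetic. The following Python sketch is illustrative only: the helper
name coding_bp_to_residues, the assumption that each quoted coding length
includes one stop codon, and the rule-of-thumb average residue mass of 110 Da
are introduced here and are not taken from the specification.

```python
def coding_bp_to_residues(coding_bp: int) -> int:
    """Residues encoded by a coding sequence assumed to end in one stop codon."""
    if coding_bp % 3 != 0:
        raise ValueError("coding length must be a multiple of 3")
    return coding_bp // 3 - 1  # the final codon is the stop codon


# Assumed rule-of-thumb average mass per residue (not a value from the text).
AVG_RESIDUE_MASS_DA = 110.0

# U1 (D53) cDNA: 5'-UTR + coding sequence + 3'-UTR, as quoted in the text.
u1_total_bp = 180 + 615 + 552
d53_residues = coding_bp_to_residues(615)
md52_residues = coding_bp_to_residues(558)

# Rough mass estimate for D53 in kDa under the assumed average residue mass.
d53_mass_kda = d53_residues * AVG_RESIDUE_MASS_DA / 1000.0

print(u1_total_bp, d53_residues, md52_residues, round(d53_mass_kda, 1))
```

Under these assumptions the coding lengths of 615 bp and 558 bp correspond to
the stated 204 and 185 amino acids, and the rough mass estimate for D53 falls
close to the stated 22.5 kD.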

Domain Features Commonly Identified in D52 Protein Family
Members

The identity of the U1 cDNA as a D52 homolog (termed hD53) was
confirmed upon aligning the predicted hD53 amino acid sequence (SEQ ID
NO:10) with those of hD52 (SEQ ID NO:50) and mD52 (SEQ ID NO:12), as
shown in Figure 26(A). The 204 amino acids of hD53 are 52% identical/66%
conserved with respect to hD52, and human and murine D52 homologs are 86%
identical/91% conserved. The hD53, mD52 and hD52 sequences were further
examined using a number of sequence analysis programs in order to further
evaluate the significance of these homologies. Due to the previous identification
of a central region displaying 7-amino acid periodicities of apolar amino acids in
hD52 (Byrne, J.A. et al., Cancer Res. 55:2896-2903 (1995)), a program was used
which statistically compares query sequences with known coiled-coil domains
(Lupas, A. et al., Science 252:1162-1164 (1991)). Coiled-coil domains are
amphipathic α-helical domains characterized by hydrophobic residues at positions
a and d of an abcdefg heptad repeat pattern, and frequently also by charged
residues at positions e and g (reviewed in, Adamson, J.G. et al., Curr. Opin.
Biotechnol. 4:428-437 (1993)). Coiled-coil structures, which represent protein
dimerization domains, are formed between 2 coiled-coil domains which adopt a
supercoil structure such that their nonpolar faces are continually adjacent, and
both hydrophobic and ionic interactions are important for their formation and
stability (Adamson, J.G. et al., Curr. Opin. Biotechnol. 4:428-437 (1993)).
Putative coiled-coil domains of 40-50 amino acids were identified towards the N-
terminus of hD53, mD52 and hD52 sequences, and are predicted to comprise
amino acids 22-71 in hD53 (SEQ ID NO:10) and hD52 (SEQ ID NO:51), and
amino acids 29-71 in mD52 (SEQ ID NO:12), as shown in Figure 26(B). It can
be noted that not all a and d positions of the heptad repeats in these predicted
coiled-coil domains are occupied by hydrophobic residues (Fig. 26(B)). This
reflects the fact that certain deviations from the previously mentioned sequence
characteristics of coiled-coil domains are not incompatible with the formation of
coiled-coil structures (Lupas, A. et al., Science 252:1162-1164 (1991); Adamson,
J.G. et al., Curr. Opin. Biotechnol. 4:428-437 (1993)).
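
The heptad repeat pattern described above can be illustrated with a short
Python sketch. This is not the statistical program of Lupas et al.; it only
labels residues with heptad positions a to g and reports how many a and d
positions carry an apolar residue. The function names, the set of residues
treated as hydrophobic and the toy sequences are all assumptions made for
illustration and are not D52 or D53 sequences.

```python
# Assumed working definition of apolar residues (one letter code).
HYDROPHOBIC = set("AILMFVWY")
REGISTERS = "abcdefg"


def heptad_core_positions(seq: str, first_register: str = "a"):
    """Return (index, residue, register) for residues at heptad positions a and d."""
    offset = REGISTERS.index(first_register)
    core = []
    for i, residue in enumerate(seq):
        register = REGISTERS[(i + offset) % 7]
        if register in "ad":
            core.append((i, residue, register))
    return core


def hydrophobic_core_fraction(seq: str, first_register: str = "a") -> float:
    """Fraction of a and d positions occupied by a hydrophobic residue."""
    core = heptad_core_positions(seq, first_register)
    if not core:
        return 0.0
    hits = sum(1 for _, residue, _ in core if residue in HYDROPHOBIC)
    return hits / len(core)


# Hypothetical toy sequences: three ideal heptads, and the same with one
# d position replaced by Ser to mimic a tolerated deviation.
ideal = "LKKLEEK" * 3
deviant = "LKKLEEK" + "LKKSEEK" + "LKKLEEK"

print(hydrophobic_core_fraction(ideal), hydrophobic_core_fraction(deviant))
```

In the second toy sequence one of the six a and d positions is not
hydrophobic, which is the kind of deviation the text describes as compatible
with coiled-coil formation.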

Visual inspection of these 3 amino acid sequences followed by computerized
analysis identified a second domain type predicted to be present in each protein,
this being the PEST domain (Rogers, S. et al., Science 234:364-368 (1986)).
PEST domains are considered to be proteolytic signals, having been identified in
proteins known to have short intracellular half-lives (Rechsteiner, M., Semin. Cell
Biol. 1:433-440 (1990)). They are enriched in Pro, Glu, Asp, Ser and Thr
residues, and are flanked by Lys, Arg or His residues, although in the absence of
these, the N- or C-terminus protein end is also permitted (Rogers, S. et al.,
Science 234:364-368 (1986)). PEST domains can be objectively found and
assessed using an algorithm which assigns a so-called PEST score, giving a
measure of the strength of a particular PEST sequence's candidature. We used
this algorithm to identify PEST signals, and their sequences and associated PEST
scores are listed in Table VIII (hD52 (AA10-40) (SEQ ID NO:72); mD52 (AA10-
40) (SEQ ID NO:12); hD53 (AA1-37) (SEQ ID NO:10); hD52 (AA152-179)
(SEQ ID NO:73); mD52 (AA152-185) (SEQ ID NO:12); hD53 (AA169-190)
(SEQ ID NO:10)). Almost all putative PEST signals identified have associated
PEST scores of greater than zero, which is considered to define a PEST sequence
(Rechsteiner, M., Semin. Cell Biol. 1:433-440 (1990)), with only the C-terminally
located PEST domain of hD53 representing a weaker PEST candidate.
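
The compositional criteria described above (enrichment in Pro, Glu, Asp, Ser
and Thr, with Lys, Arg or His flanks) can be sketched in Python. The sketch
below does not compute the PEST score used in this study; it only counts the
enriched residues and checks the flanks. The function names are assumptions,
and the example sequence is the C-terminal hD53 candidate as it appears in the
table of candidate PEST domains.

```python
PEDST = set("PEDST")  # Pro, Glu, Asp, Ser, Thr
BASIC = set("KRH")    # Lys, Arg, His


def pedst_count(seq: str) -> int:
    """Number of Pro, Glu, Asp, Ser and Thr residues in the sequence."""
    return sum(1 for residue in seq if residue in PEDST)


def has_basic_flanks(seq: str) -> bool:
    """True if the first and last residues are both Lys, Arg or His."""
    return len(seq) > 0 and seq[0] in BASIC and seq[-1] in BASIC


# C-terminal hD53 candidate, copied from the table of candidate PEST domains.
candidate = "KVGGTNPKGGSFEEVLSSTAH"

count = pedst_count(candidate)
fraction = count / len(candidate)
flanked = has_basic_flanks(candidate)

print(count, len(candidate), round(fraction, 2), flanked)
```

A simple composition count of this kind does not reproduce the published
scoring, which also weighs hydrophobicity, so it should be read only as an
illustration of the enrichment and flanking criteria.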

A third feature which is common between the 3 sequences is an uneven
distribution of charged amino acids within these. All 3 proteins are predominantly
acidic, with pIs of 4.70, 4.75, and 5.58 for mD52, hD52 and hD53, respectively.
However, while approximately the first and last 50 amino acids of each protein
exhibit a predominant negative charge (due in part to the presence of PEST
domains), the central portion of each protein exhibits an excess of positively
charged residues, with the most frequently occurring charged amino acid residue
being Lys in all cases (Fig. 26(A)).

Finally, mD52, hD52 and hD53 proteins possess sites for similar potential
posttranslational modifications, although the frequency and positions of these sites
are not identical in the 3 sequences. All 3 proteins may be subject to
N-glycosylation, since in both mD52 and hD52, Asn*^ is a potential glycosylation
site, with Asn^^ being a second potential site in mD52, whereas Asn iS a
potential site in hD53. A number of potential phosphorylation sites were
originally noted in hD52 (Byrne, J.A. et al., Cancer Res. 55:2896-2903 (1995)),
and a similar analysis of the potential phosphorylation sites present in mD52 and
hD53 reveals that hD53 includes a greater density of potential phosphorylation
sites (14 potential sites) than either mD52 or hD52 (8 and 9 potential sites,
respectively). Moreover, the distribution of these sites in hD53 differs from the
pattern observed in mD52 and hD52, which is largely conserved between these 2
molecules. Of 14 potential phosphorylation sites in hD53, 4 are also found in both
mD52 and hD52, and the remainder are distinct to hD53 (Fig. 26(A)). Most
interestingly, Tyr*^ of hD53, which is located within a 13 amino acid insertion
with respect to the aligned mD52 and hD52 sequences, is predicted to be
phosphorylated by tyrosine kinase, whereas no such site exists in either mD52 or
hD52.

Homologies Between D52 Protein Family Members and Other Amino
Acid Sequences

In contrast to the degree of homology present between hD53 and h/mD52,
the predicted hD53 amino acid sequence (Fig. 24(B); SEQ ID NO:10) shows
relatively little homology with sequences of described proteins, as initially
observed for hD52 (Byrne, J.A. et al., Cancer Res. 55:2896-2903 (1995)).
Homology can be identified between the coiled-coil domain of hD53 and similar
domains of other proteins, such as yeast ZIP1 (Sym, M. et al., Cell 72:365-378
(1993)). Lower levels of amino acid sequence identity are observed between more
extensive regions of hD53, and proteins of the cytoskeleton, or other homologous
proteins. For example, weak homology (20% identity, 34% conservation) was
noted over 172 amino acids of hD53 with moesin from the pig (Lankes, W.T. et
al., Biochim. Biophys. Acta 1216:479-482 (1993)), the human (Lankes, W.T. &
Furthmayr, H., Proc. Natl. Acad. Sci. USA 88:8297-8301 (1991)) and the mouse
(Sato, N., J. Cell Sci. 103:131-143 (1992)). Somewhat higher levels of sequence
identity (31-36% identity, 45-51% homology) were noted between amino acids
139-177, and histone H1 sequences from maize (Razafimahatratra, P. et al., Nucl.
Acids Res. 19:1491 (1991)) and wheat (Yang, P. et al., Nucl. Acids Res. 19:5077
(1991)).

Recently, we noted a significantly higher degree of homology between
h/mD52 and hD53 sequences and that of the putative protein F13E6.1 encoded
between nucleotides 5567-6670 of the Caenorhabditis elegans chromosome X
cosmid F13E6 (EMBL Accession No. Z68105; Wilson, R. et al., Nature 368:32-
38 (1994)). At 257 amino acids in length, the putative F13E6.1 protein is
somewhat longer than D52 and D53, with 42 amino acids (amino acids 121-167)
corresponding to predicted exon 4 of the F13E6.1 gene not being present in D52
or D53 sequences. F13E6.1 is most similar to hD52, where aligning the 2
sequences using the programme gap indicates 36.2% identity/45.4% conservation
of homology over the 185 amino acids of hD52. The existence of transcripts
deriving from this or a similar gene is indicated by EST sequences deriving from
cDNA clones from Caenorhabditis elegans (GenBank Accession Nos. D73047,
D73326, D76021 and D76362) and the parasitic nematode Strongyloides
stercoralis (GenBank Accession No. N21784). In summary, it is possible that a
D52 homolog or ancestral gene exists in nematodes.

Chromosomal Localizations of D52 and D53 Genes

Previous gene mapping studies have indicated a single hD52 locus at
chromosome 8q21 (Byrne, J.A. et al., Cancer Res. 55:2896-2903 (1995)). Thus
in the present study we similarly determined the chromosomal localizations for
hD52 and mD52, in order to determine whether human gene members of the
proposed D52 family are clustered on chromosome 8q, and whether this/these loci
may be syntenically conserved in other species.

In the 100 metaphase cells examined after in situ hybridization using the
hD53 116783 probe, there were 172 silver grains associated with chromosomes,
and 57 of these grains (33.1%) were located on chromosome 6. The distribution
of grains on this chromosome was not random, 40/57 (70.2%) of these mapping
to the q22-q23 region (Fig. 27(A)). These results allow us to map the hD53 locus
to the 6q22-q23 bands of the human genome, thus demonstrating that independent
loci on separate chromosomes exist for the hD52 and hD53 genes.

Using the mD52 C1 probe, 153 silver grains were associated with
chromosomes in the 100 metaphase cells examined after in situ hybridization.
Forty-one of these grains (26.8%) were located on chromosome 3. The
distribution of grains on this chromosome was not random, 35/41 (85.3%) of
these mapping to the A1-A2 region (Fig. 27(B)). A secondary hybridization peak
was detectable on chromosome 8, since 30 of the total grains were located on this
chromosome (19.6%), and the distribution of grains on this chromosome was not
random, 23/30 of these mapping to the C band. Thus, we were able to define 2
mD52 loci, on chromosome 3A1-3A2, and chromosome 8C of the mouse genome,
a result which was somewhat unexpected given the existence of a single hD52
locus.

The mouse chromosome 3A1-3A2 region has been reported to be syntenic
with regions of human chromosome 8q (O'Brien, S.J. et al., Report of the
Committee on Comparative Gene Mapping, in HUMAN GENE MAPPING 846
(1993); Lyon, M.F. & Kirby, M.C., Mouse Genome 93:23-66 (1995)), including
band 8q22 adjacent to the hD52 gene at 8q21. This suggests that the chromosome
3A1-3A2 locus is the major mD52 locus, and corresponds with the distribution of
silver grains between the 2 sites, 22.9% of all grains associated with chromosomes
being found at chromosome 3A1-A2, compared with 15.0% associated with
chromosome 8C. The significance of the dual mouse D52 loci is currently
unknown. The chromosome 8C locus may represent a mD52 pseudogene, or
another highly mD52-homologous gene. While it is currently not possible to
distinguish between these possibilities, it would appear from the existence of a
single hD52 locus that either secondary loci do not exist in the human, or that they
are co-localized with the primary hD52 locus at human chromosome 8q21.
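
The grain distributions reported above follow directly from the quoted counts.
The Python sketch below recomputes them as a consistency check; the helper
name percent and the variable names are assumptions introduced for
illustration.

```python
def percent(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100.0 * part / total, 1)


# Grain counts quoted in the text for the hD53 116783 probe.
hd53_total, hd53_chr6, hd53_6q22_q23 = 172, 57, 40

# Grain counts quoted in the text for the mD52 probe.
md52_total, md52_chr3, md52_3a1_a2 = 153, 41, 35
md52_chr8, md52_8c = 30, 23

summary = {
    "hD53 grains on chromosome 6": percent(hd53_chr6, hd53_total),
    "chromosome 6 grains at 6q22-q23": percent(hd53_6q22_q23, hd53_chr6),
    "mD52 grains on chromosome 3": percent(md52_chr3, md52_total),
    "mD52 grains on chromosome 8": percent(md52_chr8, md52_total),
    "all mD52 grains at 3A1-A2": percent(md52_3a1_a2, md52_total),
    "all mD52 grains at 8C": percent(md52_8c, md52_total),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

The recomputed values agree with the percentages given in the text. The
fraction 35/41 evaluates to roughly 85.37%, which the text reports as 85.3%.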

Comparative Expression Patterns of hD52 and hD53 in Human Breast
Tissues and Breast Cancer Cell Lines

The expression pattern of hD53 was evaluated in normal adult human
tissues, breast carcinomas and fibroadenomas, and a number of cell lines using
Northern blot analysis. A single 1.5 kb hD53 transcript was detected in all
samples positive for hD53 expression (Fig. 28 and data not shown). Of those
normal tissues examined, the hD53 transcript was detected in kidney and very
weakly in skin, but not in liver, stomach, colon, kidney or placenta. In breast
tumors, the hD53 transcript was detected in 4/9 carcinomas and in 1/3
fibroadenomas, hD53 transcript levels being noted to be similar in these 5 tumors
(data not shown). All tissue and tumor samples in which the hD53 transcript was
detected also contained detectable levels of hD52 transcripts. However, the hD53
gene appeared to be less widely expressed than hD52 at the level of sensitivity
offered by Northern blot analysis, since only a proportion of those tissues
expressing hD52 transcripts showed detectable levels of hD53 (data not shown).

Initial results from Northern blot analyses of hD53 expression in breast
carcinoma cell lines indicated that hD52 transcript levels were higher in estrogen
receptor-positive cell lines than in those considered not to express the estrogen
receptor (Byrne, J.A. et al., Cancer Res. 55:2896-2903 (1995)). Thus, we
undertook to examine whether hD52 and/or hD53 transcript levels could be
influenced by the presence/absence of estradiol in growth media. Hybridization
of hD52 and hD53 probes with RNA samples from human breast carcinoma cell
lines indicated that mRNAs corresponding to both genes were detectable in MCF7
and BT-474 cells (which express the estrogen receptor), and in BT-20 cells (which
do not express the estrogen receptor) (Fig. 28). However, the relative transcript
levels for hD52 and hD53 were not identical in these cell lines, hD52 being
relatively strongly expressed in BT-474 cells, and relatively weakly expressed in
BT-20 cells, whereas the inverse was true for hD53.

In MCF7 cells, removal of estrogen from the culture medium coincided
with reduced hD53 and hD52 transcript levels, whereas supplementation of the
media to estradiol concentrations of lO^/lO"* M restored control hD52 or hD53
transcript levels (Fig. 28). In the BT-474 cell line, culturing cells for 5 days in
steroid-depleted media did not alter hD52 transcript levels, and estradiol
supplementation of depleted media to 10 ' or 10^ M coincided with decreased
hD52 transcript levels. The hD53 transcript levels were altered in BT-474 cells
in a different way, in that these decreased in cells cultured in estrogen-depleted
media, and were not restored by subsequent estradiol supplementation (Fig. 28).
In BT-20 cells, the presence or absence of estradiol resulted in no appreciable
changes in hD52 or hD53 transcript levels compared with 36B4 mRNA levels
noted in the same samples (Fig. 28).

The effectiveness of estradiol deprivation and supplementation was
assessed through rehybridizing the same blots with a probe to human pS2, a gene
whose transcription is directly controlled by estrogen in MCF7 cells (Brown,
A.M.C. et al., Proc. Natl. Acad. Sci. USA 81:6344-6348 (1984)). Levels of pS2
mRNA have been shown to increase for up to 3 days of estradiol treatment, by
which time the magnitude of induction is as much as 30-fold (Westley, B. et al.,
J. Biol. Chem. 259:10030-10035 (1984)). Accordingly, in MCF7 and BT-474
cells, pS2 transcript levels were either low or undetected in steroid-depleted
media, whereas estradiol treatments resulted in inductions of pS2 gene expression.
However, pS2 mRNA was undetected in estrogen receptor-negative BT-20 cells,
in agreement with previous findings (May, F.E.B. & Westley, B.R., J. Biol. Chem.
263:12901-12908 (1988)).

Reduction in hD52 or hD53 mRNA Levels Upon Induction of
Differentiation in Leukemic Cell Lines

Initial results from Northern blot analyses had previously indicated that
hD52 transcripts were detectable in HL-60 myelocytic leukemia cells, but not in
K-562 proerythroblastic leukemia cells (Byrne, J.A. et al., Cancer Res. 55:2896-
2903 (1995)), and we thus decided to examine the expression of hD53 in these
same cell lines. In cells cultured under normal conditions (see Materials and
Methods), we noted reciprocal patterns of expression for the hD52 and hD53
genes in these cell lines, in that hD52 transcripts were detected in HL-60 cells but
not in K-562 cells, whereas hD53 transcripts were detected in K-562 cells but not
in HL-60 cells (Fig. 29(A) and (B)).

The proliferative and differentiation responses of HL-60 cells and K-562
cells to chemical agents such as TPA have been thoroughly characterized
(reviewed in, Harris, P. & Ralph, P., J. Leuk. Biol. 37:407-422 (1985);
Sutherland, J.A. et al., J. Biol. Resp. Modif. 5:250-262 (1986)), with TPA
promoting differentiation along the monocyte/macrophage pathway in both cell
lines. Culturing cells in the presence of 16 nM or 160 nM TPA resulted in decreased
hD52 transcript levels in treated HL-60 cells (Fig. 29(A)), and decreased hD53
transcript levels in treated K-562 cells (Fig. 29(B)), after periods of 18-24 hrs. As
a molecular control for the efficacy of TPA treatments, filters were rehybridized
with a transferrin receptor cDNA insert (Kühn, L.C. et al., Cell 37:95-103
(1984)), since reduced transferrin receptor transcript levels have been reported for
both HL-60 cells (Ho, P.T.C. et al., Cancer Res. 49:1989-1995 (1989)) and
K-562 cells (Schonhorn, J.E., J. Biol. Chem. 270:3698-3705 (1995)) after TPA
treatment. The kinetics with which decreased transferrin receptor transcript levels
were noted in TPA-treated cells (Fig. 29(A) and (B)) are in good agreement with
those previously reported (Ho, P.T.C. et al., Cancer Res. 49:1989-1995 (1989);
Schonhorn, J.E., J. Biol. Chem. 270:3698-3705 (1995)). Interestingly, parallel
decreases (with respect to both their magnitudes and kinetics) were observed for
transferrin receptor and hD52 or hD53 transcripts in HL-60 cells (Fig. 29(A)) and
K-562 cells (Fig. 29(B)), respectively.

Discussion 

We report the cloning of a novel human cDNA termed hD53, and of the
mouse D52 cDNA homolog, due to the clear similarity between these sequences
and hD52 (Byrne, J.A. et al., Cancer Res. 55:2896-2903 (1995)). The high
conservation of homology between h/mD52 and hD53 sequences, combined with
the low levels of homology existing between these sequences and those of other
characterized proteins, leads us to propose the existence of the novel D52
gene/protein family. The fact that mD52 and hD52 sequences are 86%
identical/91% conserved, combined with the possible existence of a D52 homolog
or ancestral gene in nematodes, suggests basic cellular functions for D52 family
proteins, which are as yet unknown. However, the results of sequence analyses
and of further experiments presented here have allowed us to form hypotheses
regarding their functions.

A central hD52 region of approximately 110 amino acids displaying
7-amino acid periodicities of apolar amino acids was previously identified by virtue
of low levels of homology with cytoskeletal protein regions (Byrne, J.A. et al.,
Cancer Res. 55:2896-2903 (1995)). Using the so-called Lupas algorithm (Lupas,
A. et al., Science 252:1162-1164 (1991)), we have now identified a single coiled-
coil domain in hD52, mD52 and hD53 towards the N-terminus of each protein,
and which is predicted to end at Leu71 in all 3 proteins. This coiled-coil domain
overlaps with the leucine zipper predicted in hD52/N8 using helical wheel analysis.
The presence of a coiled-coil domain in D52 family proteins indicates that specific
protein-protein interactions are required for the function(s) of these proteins.
Similarly, the presence of 2 candidate PEST domains in D52 proteins indicates
that their intracellular abundances may be in part controlled by proteolytic
mechanisms. Interestingly, the extent of the N-terminally located PEST domain
overlaps that of the coiled-coil domain in both D52 and D53 proteins. It could
thus be envisaged that interactions via the coiled-coil domain could mask this
PEST domain, in accordance with the hypothesis that PEST sequences may act
as conditional proteolytic signals in proteins able to form complexes (Rechsteiner,
M., Adv. Enzyme Reg. 27:135-151 (1988)).

At present, the cellular distribution pattern of hD53 transcripts in tissues
is unknown and thus the significance of hD52 and hD53 co-expression in tissues
cannot be evaluated. However, the results obtained for hD52 and hD53
expression in breast carcinoma cell lines indicate that the 2 genes may be
expressed in the same cell type, with co-expression of hD52 and hD53 transcripts
being demonstrated in 3/5 cell lines examined (BT-20, BT-474 and MCF7). In the
remaining 2 cell lines (HBL100 and ZR-75-1), only hD52 transcripts were
detectable (Byrne, J.A. et al., Cancer Res. 55:2896-2903 (1995); Byrne, J.A.,
unpublished results), and thus hD52 may be more frequently or abundantly
expressed than hD53 in breast carcinoma cells. Since neither hD52 nor hD53
transcripts were detected in HFL1 fibroblasts (Byrne, J.A. et al., Cancer Res.
55:2896-2903 (1995); Byrne, J.A., unpublished results), we thus currently
hypothesize that hD53, like hD52 (Byrne, J.A. et al., Cancer Res. 55:2896-2903
(1995)), represents an epithelially-derived marker.

Estradiol stimulation/deprivation experiments performed in MCF7 cells
indicate that the hD52 and hD53 transcript levels normally measured in MCF7
cells cultured with FCS are dependent upon estradiol. At present, the mechanism
by which estradiol induces the accumulation of hD52 and hD53 transcripts in
MCF7 cells is unknown. It is possible that fluctuations in hD52/hD53 transcript
levels may be secondary to the mitogenic effects of estrogen on MCF7 cells, and
not directly produced by estradiol per se. However, estradiol
stimulation/deprivation experiments performed in a second estrogen receptor-
positive breast carcinoma cell line, BT-474, gave different results from those
observed in MCF7 cells. The hD52 transcript level present in BT-474 cells
cultured with FCS was not estrogen dependent, and indeed supplementing steroid-
depleted media with 1 ff' M and 1 Or* M estradiol resulted in significantly decreased
hD52 transcript levels. Such differing effects in 2 estrogen receptor-positive
breast carcinoma cell lines may indicate multiple mechanisms by which the
estradiol-estrogen receptor complex may influence hD52 gene expression in breast
carcinoma cells, or the existence of different, cell-specific factors in BT-474 and
MCF7 cells which cooperate with the receptor complex in this process (Parker,
M.G., Curr. Opin. Cell Biol. 5:499-504 (1993); Cavailles, V. et al., Proc. Natl.
Acad. Sci. USA 91:10009-10013 (1994)). Furthermore, estradiol
deprivation/supplementation had different effects on hD52 and hD53 transcript
levels in BT-474 cells. Decreased hD53 transcript levels were observed in cells
cultured for 5 days in steroid-depleted media, whether or not this media had been
subsequently supplemented with estradiol for the last 3 days of culture. We
interpret these results as indicating that the absence of factor(s) in the steroid-
depleted media resulted in decreased hD53 transcript levels, and that in this case
the factor was not estradiol.

While hD52 and hD53 were found to be co-expressed in 3/5 breast
carcinoma cell lines, corresponding findings in leukemic cells confirm that co-
expression of these genes is not obligatory. HL-60 cells are myelocytic leukemia
cells, and can be induced to differentiate along granulocytic or macrophage
pathways (Harris, P. & Ralph, P., J. Leuk. Biol. 37:407-422 (1985)), whereas K-
562 leukemia cells have erythroid characteristics, and can be induced to exhibit
features characteristic of granulocytic, macrophagic and megakaryocytic
differentiation (Sutherland, J.A. et al., J. Biol. Resp. Modif. 5:250-262 (1986)).
The present study has provided another molecular distinction between these 2 cell
lines, since hD52 transcripts were detected in HL-60 cells but not in K-562 cells,
whereas hD53 transcripts were detected in K-562 cells but not in HL-60 cells.
This suggests that hD52/hD53 gene expression status may find future use as a
marker to distinguish between different forms of leukemia.

Treatment of HL-60 and K-562 cells with TPA was found to have similar
effects in reducing hD52 and hD53 transcript levels, respectively. This provides
a second example of similar regulation of gene expression for these 2 different
genes, this time in 2 different cell lines, and could be considered further proof of
a functional relationship between the hD52 and hD53 genes. The mechanism by
which hD52 and hD53 transcript levels are reduced in HL-60 and K-562 cells by
TPA treatment is currently unknown. It is possible that reduced hD52 or hD53
transcript levels arise as an indirect consequence of TPA treatment, which is
known to result in a marked cessation of proliferation, and an induction of
macrophagic differentiation in both HL-60 and K-562 cells. However, the fact
that hD52/hD53 and transferrin receptor transcript levels decreased in parallel
fashions in TPA-treated cells indicates that a common stimulus might be
responsible for these events.

In summary, we have demonstrated the existence of a new gene/protein
family, the D52 family, which is presently comprised of D52 and D53. The
presence of an acidic coiled-coil domain in both D52 and D53 proteins indicates
that specific protein-protein interactions may form an important component of
D52 and D53 function. This, combined with the fact that hD52 and hD53
transcripts are coexpressed in some human cell lines, leads us to speculate that
hD52 and hD53 may be able to interact in vivo. However, our observations in
HL-60 and K-562 cell lines, where the 2 genes were not co-expressed judging
from Northern blot data, indicate that if indeed hD52 and hD53 are cellular
partners, this partnership is not obligatory. Other partners may exist for each
of these proteins, and it is tempting to speculate that under certain conditions, the
formation of homodimers may be favored.

TABLE VIII

Candidate PEST Domains Identified in hD52, mD52 and hD53 Amino Acid Sequences

Sequence  Amino acids  PEST domain sequence                   PEST score

hD52      10-40        ETDPVPEEGEDVAATISATETLSEEEQEELE'       15.8
mD52      10-40        KTEPVAEEGEDAVTMLSAPKALTEEEQEELR        11.8
hD53      1-37         MEAQAQGIXETEPLQGTDEDAVASADFSSMLSEEEK   5.8

hD52      152-179      igPAGODFGEVIiNrSAANASATTTEPLPEIC       0.6
mD52      152-185      IPVVCK^DFGEVUOSTANATSIMmPPPEQMrESP*    9.0
hD53      164-184      KVGGTNPKGGSFEEVLSSTAH                  -6.0

Positively charged amino acids and protein termini are underlined, whereas PEDST
residues are shown in bold. Amino acid residues are indicated using the one letter
code.

Example 4

Two Distinct Amplified Regions Involved at 17q11-q21
in Human Primary Breast Cancer

Introduction 

Gene amplification has been shown to play an important part in the 
pathogenesis and prognoius of various solid tumors including breast cancer, 
probably because overexpression of the amplified target gene confers a sdective 
advantage. The first teduiique to detect gene amplification was cytogenetic 
analysis. Thus annplificafion of several chromosomal regions, visualized as dtl^ 
extrachromosomal double nunutes (dmin) or integrated homogeneously staining 
re^ons (hsrs) are amoiig the ma|or vi^le cytogenetic abnormalities found in 
breast tumors (Gd>hart, E. el aLi Breast Cancer Bes. Treat. 125-138 (1986); 
Dutrillaux, B. eial., Cytogenet -/9:203-217 (1990)). Other techniques such as 
comparative graomic hybridization (CGH) and a novel strategy based upon 
chromosome microdissection and fluorescence in situ hybridization have also been 
applied to broad searches for r^ons of increased DNA copy nimiber in tumor 
cdIs(Guan, XY. etaL, Nat Genet »:155.161 (1994); Muleris, M. etaL, Genes 
Chrom. Cancer 70:160-170 (1994)). These diflferent techniques have revealed 
some 20 anq)Iified dvomosomal re^ons in breast tumors. These amplified 
r^ons results m S- to 100-fold amplification of a small numb^ of genes^ few of 
wAkii are thought to contribute in a dominant manner to the malignant phenotype. 
Positional cloning efforts are beginning to identify the critical gene(s) in each
amplified region. To date, genes documented to be amplified in breast cancers include
FGFR1 (8p12), MYC (8q24), FGFR2 (10q26), CCND1, GSTP1 and EMS1 (11q13),
IGFR and FES (15q24-q25), and ERBB2 (17q12-q21) (reviewed in Bièche, I. &
Lidereau, R., Genes Chrom. Cancer 14:227-251 (1995)). Segment q11-q21 of
chromosome 17 seems to be one of the most commonly amplified regions in
human breast carcinomas. FISH, CGH and chromosome microdissection have
shown a high increase in DNA-sequence copy number of this region (Kallioniemi,
O. et al., Proc. Natl. Acad. Sci. USA 89:5321-5325 (1992); Guan, X.Y. et al.,
Nat. Genet. 8:155-161 (1994); Muleris, M. et al., Genes Chrom. Cancer
10:160-170 (1994)). Amplification of 17q12 was originally discovered in breast
carcinoma using a probe to the ERBB2 gene (Slamon, D.J. et al., Science 235:177-182
(1987)). Other tumor types quickly followed, including cancers of the ovary,
stomach and bladder and, less frequently, lung and colon carcinomas.
Interestingly, amplification at 17q12-q21 has been shown to be clinically
relevant in breast cancer: independent studies have shown an
association with an increased risk of relapse (Slamon, D.J. et al., Science
235:177-182 (1987); Ravdin, P.M. & Chamness, G.C., Gene 159:19-27 (1995)). To date,
only one gene, ERBB2, has been proposed to be responsible for the emergence of
this amplicon. The ERBB2 proto-oncogene belongs to the ERBB family, the first
identified member of which (ERBB1) encodes the EGF (epidermal growth factor)
receptor (Dougall, W.C. et al., Oncogene 9:2109-2123 (1994)). ERBB2
amplification is associated with overexpression of its product. This gene is a good
candidate for a role in breast cancer because of its transforming potency (DiFiore,
P.P. et al., Science 237:178-182 (1987)) and because transgenic mice carrying the
ERBB2 gene show altered mammary cell proliferation and a high incidence of
mammary adenocarcinomas (Muller, W.J. et al., Cell 54:105-115 (1988)).

All these initial reports emphasized a potential role for the ERBB2
proto-oncogene at 17q12-q21 in human breast carcinomas. However, four novel genes
(called MLN 50, 51, 62 and 64) from this chromosomal region have recently been
identified by differential screening of a cDNA library established from breast
cancer-derived metastatic axillary lymph nodes (Tomasetto, C. et al., Genomics
28(3):367-376 (1995)). The MLN 51 and MLN 64 genes showed little homology with
others already described. The MLN 62 gene (also known as CART1 or TRAF4) is a
novel member of the tumor necrosis factor receptor-associated protein family
(Régnier, C.H. et al., Journal of Biological Chemistry 270(43):25715-25721 (1995)),
while the MLN 50 gene (also named Lasp-1) defines a new LIM protein subfamily
characterized by the association of a LIM motif and a Src homology region 3
(SH3) domain at the N- and C-terminal parts of the protein, respectively
(Tomasetto, C. et al., Genomics 28(3):367-376 (1995)).

These four genes have been found amplified and overexpressed in breast
cancer cell lines. Therefore, amplification of 17q11-q21 DNA sequences may be
more complex than first suspected, and the number and identity of the target
gene(s) remain open questions.

In the present study we have investigated a large series of primary breast
tumors for amplification of the ERBB2 gene and the four novel genes. We report that
25.5% of the breast tumors show amplification of one or more of these genes.
Preliminary mapping of the amplicons suggests the involvement of two distinct
amplified regions at 17q11-q21 in human primary breast cancer. Moreover, we
suggest three genes (MLN 62, ERBB2 and MLN 64) as likely targets of the
amplification event at these two chromosomal regions.

Materials and Methods 

Tumor and Blood Samples

Samples were obtained from 98 primary breast tumors surgically removed
from patients at the Centre René Huguenin (France); none of the patients had
undergone radiotherapy or chemotherapy. Immediately following surgery, the
tumor samples were placed in liquid nitrogen and stored at -70°C until extraction
of high-molecular-weight DNA and RNA. A blood sample was also taken from
each patient.

DNA Probes

A pMAC117 probe (a 0.8-kb AccI DNA fragment from a
genomic clone of ERBB2) was used to detect ERBB2 (ATCC No. 53408). The
four novel clones (MLN 50, 51, 62 and 64) were described in detail in Tomasetto
et al. (1995). These five probes were previously positioned and ordered by in situ
hybridization (Tomasetto, C. et al., Genomics 28(3):367-376 (1995)).

For Southern-blot analysis, the control probes used were human β-globin
(Wilson, J.T. et al., Nucl. Acids Res. 5:563-581 (1978)) and the MOS
proto-oncogene (ATCC No. 41004).

For Northern-blot analysis, the control probe used was a 0.7-kb PstI
fragment of the 36B4 cDNA, as described by Masiakowski, P. et al., Nucl. Acids
Res. 10:7895 (1982).

DNA Analysis

DNA was extracted from tumor tissue and blood leucocytes according to
standard methods (Maniatis, T. et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor, NY (1989)). Ten µg of TaqI-restricted
DNAs were separated by electrophoresis in agarose gel (leucocyte and tumor
DNA samples from each patient were run in adjacent lanes) and blotted onto
nylon membrane filters (Hybond N+, Amersham Corp.) according to standard
techniques. The membrane filters were hybridized with nick-translated
probes, washed, and autoradiographed at -70°C for an appropriate period.

Determination of DNA Amplification

Restriction enzyme-digested tumor DNAs were compared with matching
lymphocyte DNA in the same agarose gels. Blots of these gels were first
hybridized with ERBB2 and the four MLN probes. Rehybridization of the same
blots with the MOS and β-globin probes provided a control for the amount of
DNA transferred onto the nylon membranes. The proto-oncogene and control
gene autoradiographs were first scored by visual inspection and then quantified
by densitometry. Only signals with an intensity of two copies or more were
considered to represent amplification. The amplification level was quantified by
serial dilution of tumor DNA to obtain a Southern hybridization signal similar to that
obtained with leucocyte DNA samples.
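
The scoring rule described above can be illustrated with a minimal sketch. It assumes that densitometric band intensities are available as plain numbers and that the control probe (MOS or β-globin) is used to correct for the amount of DNA loaded; the function names, the example intensities and the reading of "two copies or more" as a normalized ratio of at least 2 are illustrative assumptions, not values or procedures taken from the study.

```python
def normalized_copy_ratio(tumor_target, tumor_control, normal_target, normal_control):
    """Tumor/normal signal ratio for a target probe, corrected for DNA loading.

    Each argument is a densitometric band intensity (arbitrary units).
    The control probe signal stands in for the amount of DNA on the membrane.
    """
    if min(tumor_control, normal_control, normal_target) <= 0:
        raise ValueError("control and normal signals must be positive")
    return (tumor_target / tumor_control) / (normal_target / normal_control)


def is_amplified(ratio, threshold=2.0):
    """Assumed reading of the scoring rule: a normalized ratio of 2 or more."""
    return ratio >= threshold


# Hypothetical intensities for one tumor/leucocyte pair.
ratio = normalized_copy_ratio(tumor_target=900.0, tumor_control=150.0,
                              normal_target=200.0, normal_control=100.0)
print(ratio, is_amplified(ratio))
```

Dividing by the control signal first means that an overloaded tumor lane does not by itself produce an apparent amplification.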

RNA Analysis

RNA was extracted from normal and tumoral breast tissue by using the
LiCl/urea method (Auffray, C. & Rougeon, F., Eur. J. Biochem. 107:303-314
(1980)). Ten micrograms of RNA was fractionated by electrophoresis on 1.2%
agarose gels containing 6% formaldehyde, and analyzed by blot hybridization after
transfer onto nylon membrane filters (Hybond N, Amersham Corp.). The same
filters were first hybridized with ERBB2 and the four MLN nick-translated
labeled probes in 50% formamide at 42°C. Membranes were washed under
stringent conditions in 0.1x SSPE, 0.1% SDS at 50°C and subjected to
autoradiography for various periods at -80°C. Membranes were also rehybridized
with a 36B4 cDNA probe corresponding to a ubiquitous RNA. The signal
obtained was used to check the amount of RNA loaded on the gel in each
experiment. The 36B4 signal also showed that the RNA samples were not
extensively degraded.

Evaluation of RNA Overexpression

Relative intensities of the mRNA bands were assessed by visual
examination and confirmed by means of densitometry, taking the ubiquitous 36B4
bands into account. Increases in expression of at least 2-fold relative to normal
breast tissue expression were scored as positive. Overexpression was quantified by
serial dilution of tumor RNA to obtain a Northern hybridization signal similar to
that obtained with normal breast tissue.
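
As a rough illustration of quantification by serial dilution, the sketch below picks the dilution of tumor RNA whose band intensity comes closest to the normal-tissue signal and reports that dilution factor as the estimated fold overexpression. It assumes the intensities have already been corrected with the 36B4 signal; the function names and the example numbers are hypothetical.

```python
def fold_from_dilution_series(dilution_signals, reference_signal):
    """Return the dilution factor whose signal is closest to the reference.

    dilution_signals maps a dilution factor (1 = undiluted) to the
    36B4-corrected band intensity measured at that dilution.
    """
    if not dilution_signals:
        raise ValueError("empty dilution series")
    return min(dilution_signals,
               key=lambda factor: abs(dilution_signals[factor] - reference_signal))


def is_overexpressed(fold, threshold=2.0):
    """Scoring rule from the text: at least 2-fold relative to normal tissue."""
    return fold >= threshold


# Hypothetical two-fold dilution series of one tumor RNA sample.
series = {1: 8.1, 2: 4.0, 4: 2.1, 8: 1.05, 16: 0.5}
fold = fold_from_dilution_series(series, reference_signal=1.0)
print(fold, is_overexpressed(fold))
```

The estimate is only as fine as the dilution steps, which fits the semi-quantitative nature of Northern blot data noted in the text.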
Results

Normal DNA (peripheral blood lymphocytes) and autologous tumor DNA
from 98 breast cancer patients were screened on Southern blots for amplification
of 5 different genes (ERBB2, MLN 50, 51, 62 and 64) located at 17q11-q21.

Amplification occurred in at least one locus in 25 of the 98 tumors 
(25.5%). 

Densitometric analysis revealed that amplification levels varied not only
from case to case but, in some tumors, also from gene to gene. Amplification
ranged from 2- to more than 30-fold.

17q11-q21 Amplicon Maps in Breast Carcinomas

The 25 amplified tumors were subdivided into three groups on the basis
of pattern and level of amplification: A, tumors with amplification of all genes
with similar amplification levels; B, amplification of all genes with varied
amplification levels; and C, amplification of some of these genes. Figure 30 shows
examples of the most common patterns of genetic changes. Figure 31 summarizes
the data in the form of amplification maps.
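
The three-group scheme can be sketched as a small classifier. The text does not say how "similar" and "varied" levels were told apart, so the `similarity_factor` cutoff below is purely an assumption for illustration, as are the function name and the example fold values.

```python
GENES = ("MLN 62", "MLN 50", "ERBB2", "MLN 64", "MLN 51")


def classify_tumor(levels, amp_threshold=2.0, similarity_factor=2.0):
    """Assign a tumor to group 'A', 'B' or 'C', or None if nothing is amplified.

    levels maps each tested gene to its fold amplification.
    similarity_factor is an assumed cutoff: levels count as "similar" when
    the highest is at most similarity_factor times the lowest.
    """
    amplified = {gene: fold for gene, fold in levels.items() if fold >= amp_threshold}
    if not amplified:
        return None
    if len(amplified) < len(levels):
        return "C"  # only some of the tested genes are amplified
    lowest, highest = min(amplified.values()), max(amplified.values())
    return "A" if highest / lowest <= similarity_factor else "B"


# Hypothetical fold values, not data from the study.
uniform = {gene: 3.0 for gene in GENES}
varied = {"MLN 62": 2.0, "MLN 50": 3.0, "ERBB2": 10.0, "MLN 64": 10.0, "MLN 51": 4.0}
partial = {"MLN 62": 8.0, "MLN 50": 1.0, "ERBB2": 1.0, "MLN 64": 1.0, "MLN 51": 1.0}
print(classify_tumor(uniform), classify_tumor(varied), classify_tumor(partial))
```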

Group A (5 cases) corresponds to the existence of a single but large
amplicon at 17q11-q21. For these five tumors, amplification levels were always
low (2-5x), suggesting polysomy of the entire long arm of chromosome 17. This
first group is not of great interest for identifying the candidate genes responsible
for the emergence of amplicons.

The two other groups (groups B and C; 2 and 18 cases, respectively)
show that the size and the amplification level varied from tumor to tumor.
Tumors T0084, T0284 and T1191 had the smallest amplicon, involving only MLN
62. With the exception of these three tumors, the amplicons in all the other 17
tumors included ERBB2 and MLN 64. Interestingly, ERBB2 and MLN 64 were
always coamplified to similar levels. In 3 cases (T0109, T1273, T1512), these are
the only genes amplified at 17q11-q21. In 5 other tumors (T0391, T0183, T0309,
T0559 and T0588) the amplicons were discontinuous between MLN 62 and the
two loci ERBB2 and MLN 64. In these tumors MLN 50 showed no evidence of
amplification.
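
A discontinuous amplicon of this kind can be detected by grouping the amplified genes into runs that are contiguous along the chromosome. The sketch below is illustrative only: the text places MLN 62 centromeric to ERBB2 and MLN 64 and implies that MLN 50 lies between them, but the relative order of ERBB2 and MLN 64 and the position of MLN 51 in `ASSUMED_ORDER` are assumptions, and the example set is a hypothetical pattern rather than recorded data for a named tumor.

```python
ASSUMED_ORDER = ("MLN 62", "MLN 50", "ERBB2", "MLN 64", "MLN 51")


def amplified_segments(amplified, order=ASSUMED_ORDER):
    """Group amplified genes into runs that are contiguous in the map order.

    amplified is the set of gene names scored as amplified in one tumor.
    More than one returned run indicates a discontinuous amplicon.
    """
    segments, current = [], []
    for gene in order:
        if gene in amplified:
            current.append(gene)
        elif current:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


# Hypothetical pattern: MLN 62, ERBB2 and MLN 64 amplified, MLN 50 not.
segments = amplified_segments({"MLN 62", "ERBB2", "MLN 64"})
print(segments, len(segments) > 1)
```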

Our findings suggest the existence of two distinct amplified regions at
17q11-q12 and 17q12-q21 in human primary breast cancer; one includes the MLN 62
locus and the other the ERBB2 and MLN 64 loci.

Expression of ERBB2 and the Four MLN Genes in Breast Carcinomas

Whether the amplification of ERBB2 and the four MLN genes contributed
to elevated expression was determined by comparison of RNA expression with
DNA amplification. This was performed on a total of 20 tumor samples for which
total RNA was available: 10 samples among the 25 tumors amplified in at least
one locus and 10 unamplified tumors.

Figure 32 shows examples of some overexpressed tumors, evaluated by
Northern blot analysis. No gross alteration in the size of the mRNA was detected
in any sample. We observed a perfect overlap between RNA overexpression and
DNA amplification. Amplified tumors always overexpressed the amplified
genes, and the five genes were never overexpressed in the 10 unamplified tumor
DNA specimens. Despite the technical difficulty of obtaining quantitative data
from Northern blot analyses, a correlation seems to exist between levels of RNA
and the degree of DNA amplification. The tumors with high amplification levels
showed higher mRNA levels, irrespective of the gene analyzed.

Discussion

There are various approaches to searching for genes whose amplification may be
responsible for tumorigenesis. Cytogenetic analysis, CGH and chromosome
microdissection have allowed the localization of distinct amplified chromosomal
regions which might harbor genes contributing to tumorigenesis. Studies using
pulsed field electrophoresis have shown that amplicons in human tumor cells
usually comprise large regions of genomic DNA which can be up to several
megabases in length and contain several genes (Brookes, S. et al., Genes Chrom.
Cancer 6:222-231 (1993)). Fine-scale molecular mapping of amplified regions is
needed to locate such genes precisely. Thus, coamplification of genes located in
a limited chromosomal region has been described in human tumors. Examples
include the complex coamplification of multiple genes from 11q13 in human breast
cancer (Karlseder, J. et al., Genes Chrom. Cancer 9:42-48 (1994)) as well as from
12q13-q14 in human malignant gliomas (Reifenberger, G. et al., Cancer Res.
54:4299-4303 (1994)).

Several authors observed amplification of the ERBB2 gene from 17q11-q21
in human breast cancer (Slamon, D.J. et al., Science 235:177-182 (1987); Ali,
I.U. et al., Oncogene Res. 3:139-146 (1988); Borg, A. et al., Oncogene
6:137-143 (1991); Paterson, M.C. et al., Cancer Res. 51:556-567 (1991)). As four
novel genes from this chromosomal segment have recently been identified and
three of them have been found amplified and overexpressed in breast cancer cell
lines (Tomasetto, C. et al., Genomics 28(3):367-376 (1995)), we decided to
further characterize the 17q11-q21 region in breast cancer biopsies by studying
amplification of these four novel genes, in addition to the ERBB2 gene, in a large
series of tumor DNAs. The aim was to identify the genes within this amplification,
to determine their frequency and their level of amplification, and thus to more
precisely define the actual driver gene(s) in this amplicon(s).

Twenty-five (25.5%) of 98 tumors showed at least one of the five genes
amplified. Amplification of these five genes is systematically accompanied by
mRNA overexpression. However, it is also known that some tumors with a single
copy of an oncogene may overexpress the corresponding mRNA. In the present
study, we also examined the expression at the RNA level of ERBB2 and the four
MLN genes in 10 breast tumors that do not show amplification. We did
not observe any unamplified tumor overexpressing any of these 5 tested genes. So,
it seems that the four MLN genes, like the ERBB2 gene, are not activated in breast
carcinoma by mechanisms other than gene amplification, such as, for
example, alteration of the regulatory sequences of the genes.

In the majority of the altered tumors, amplification did not encompass all the
tested loci. The two genes most frequently amplified on 17q11-q21 in our series
were ERBB2 and MLN 64 (22.5%), which were systematically coamplified and
overexpressed at similar levels. The invariable coamplification of ERBB2 and
MLN 64 seen in our study indicates that both genes are likely to be located in
close proximity to each other at 17q12-q21. In consequence, the amplification
and consequent overexpression of MLN 64, as well as of the ERBB2 gene, could be of
pathogenetic significance for breast neoplastic growth. A third gene, MLN 62,
can be regarded as the possible target selected for a second amplicon. This gene
is located centromeric to the MLN 64 and ERBB2 genes, at 17q11-q12. Although
the MLN 62 gene was less frequently amplified (17.5%) than the MLN 64 and ERBB2
genes, it has been found with high levels of amplification in most tumors which
showed two distinct amplified regions at 17q11-q21 and was the only amplified
and overexpressed gene in three tumors (T0084, T0284 and T1191). These
findings suggest that in some tumors amplification of MLN 62 may provide a
selective growth advantage. Even if the amplicons observed in our breast tumor
series frequently contained MLN 50 and MLN 51, the amplification maps suggest
that these two genes are not the target genes of the amplification: they were
invariably coamplified with MLN 64 and ERBB2 and never showed the highest
amplification level in individual tumors. Four other ERBB2 neighboring genes
have previously been observed coamplified with ERBB2 in 10-50% of
ERBB2-amplified tumors, including THRA1 (van de Vijver, M. et al., Mol. Cell. Biol.
7:2019-2023 (1987)), RARA (Keith, W.N. et al., Eur. J. Cancer 29A:1469-1475
(1993)), GRB-7 (Stein, D. et al., EMBO J. 13:1331-1340 (1994)) and TOP2A
(Smith, K. et al., Oncogene 8:933-938 (1993)). These four genes were never
amplified alone without ERBB2 amplification. Our data, together with these other
results, therefore suggest that MLN 50 and MLN 51, as well as THRA1, RARA,
GRB-7 and TOP2A, are just incidentally included in some 17q12-q21 amplicons.

To date, little is known about the physiological and pathological functions
of MLN 62 and MLN 64. While the MLN 64 gene showed little homology with other
described genes, MLN 62/CART1/TRAF4 encodes a protein exhibiting 3 domains also
observed in the CD40-binding protein and in the tumor necrosis factor (TNF)
receptor-associated factor 2 (TRAF2), both involved in signal transduction
mediated by the TNF receptor family. So, the MLN 62/CART1/TRAF4 gene may be
involved in TNF-related cytokine signal transduction in breast carcinoma.

In conclusion, the present study shows that DNA amplification is
frequently observed in two different regions at 17q11-q21 in human breast cancer.
This suggests that several genes in these two regions are involved in the initiation
and/or progression of human breast cancer. Our preliminary mapping of these
17q11-q21 amplicons in 25 amplified breast tumors shows that they consistently
include either MLN 62/CART1/TRAF4 (17q11-q12) or MLN 64 and ERBB2
(17q12-q21). The two new genes are good candidates for a role in breast cancer
because, like ERBB2, their amplification leads to their overexpression. The main
conclusion drawn from our data is that, although ERBB2 remains a good
candidate as one of the genes under selection in the 17q11-q21 amplicons, two novel
candidate genes have been identified as driver genes of these amplicons. Thus, the
elucidation of the physiological and pathological significance of MLN
62/CART1/TRAF4 and MLN 64 would confirm the involvement of these two
genes in breast carcinogenesis.

It will be appreciated by those skilled in the art that the invention can be
performed within a wide range of equivalent parameters of composition,
concentrations, modes of administration, and conditions without departing from
the spirit or scope of the invention or any embodiment thereof.

The disclosure of all references, patent applications and patents recited
herein is hereby incorporated by reference.

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 3, line 21

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [x]

Name of depositary institution
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country)
12301 Parklawn Drive
Rockville, Maryland 20852
United States of America

Date of deposit
14 June 1996

Accession Number
ATCC 97607

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ]

Plasmid pBS hD53

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only
This sheet was received with the international application
Authorized officer

For International Bureau use only
[ ] This sheet was received by the International Bureau on:
Authorized officer



Form PCT/RO/134 (July 1992)

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 3, line 20

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [ ]

Name of depositary institution
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country)
12301 Parklawn Drive
Rockville, Maryland 20852
United States of America

Date of deposit
14 June 1996

Accession Number
ATCC 97608

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ]

Plasmid pBS-MLN50 (Lasp-1)

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only
This sheet was received with the international application
Authorized officer

For International Bureau use only
[ ] This sheet was received by the International Bureau on:
Authorized officer



Form PCT/RO/134 (July 1992)

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page [illegible], line [illegible]

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [ ]

Name of depositary institution
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country)
12301 Parklawn Drive
Rockville, Maryland 20852
United States of America

Date of deposit
14 June 1996

Accession Number
ATCC 97609

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ]

Plasmid pBS-MLN64

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only
This sheet was received with the international application
Authorized officer

For International Bureau use only
[ ] This sheet was received by the International Bureau on:
Authorized officer



Form PCT/RO/134 (July 1992)

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 3, line 21

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [ ]

Name of depositary institution
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country)
12301 Parklawn Drive
Rockville, Maryland 20852
United States of America

Date of deposit
14 June 1996

Accession Number
ATCC 97610

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ]

Plasmid pBS-MLN62 (CART1)

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only
[x] This sheet was received with the international application
Authorized officer

For International Bureau use only
[ ] This sheet was received by the International Bureau on:
Authorized officer



Form PCT/RO/134 (July 1992)

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 3, line 21

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [ ]

Name of depositary institution
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country)
12301 Parklawn Drive
Rockville, Maryland 20852
United States of America

Date of deposit
14 June 1996

Accession Number
ATCC 97611

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ]

Plasmid pBS-MLN51

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only
This sheet was received with the international application
Authorized officer

For International Bureau use only
[ ] This sheet was received by the International Bureau on:
Authorized officer



Form PCT/RO/134 (July 1992)

What Is Claimed Is:

1. An isolated nucleic acid molecule comprising a polynucleotide
selected from the group consisting of:

(a) a polynucleotide encoding a polypeptide having an amino
acid sequence as shown in Figure 6 (SEQ ID NO:2), Figure 14 (SEQ ID NO:4),
Figure 16 (SEQ ID NO:6), Figure 21 (A-D) (SEQ ID NO:8), or Figure 24(B)
(SEQ ID NO:10);

(b) a polynucleotide encoding a polypeptide having an amino
acid sequence as encoded by the cDNA contained in ATCC Deposit No. 97610,
97608, 97609, 97611, or 97607;

(c) a polynucleotide having a nucleotide sequence at least 90%
identical to the nucleotide sequence of the polynucleotide of (a) or (b);

(d) a polynucleotide that hybridizes under stringent conditions
to any of the polynucleotides of (a)-(c) or the complement thereof;

(e) a polynucleotide fragment of any of the polynucleotides of
(a)-(d), wherein said fragment is at least 15 bp in length; and

(f) a polynucleotide having a nucleotide sequence
complementary to the nucleotide sequence of any of the polynucleotides of (a)-(e).
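
The sequence relationships named in items (c), (e) and (f) can be illustrated with a minimal sketch. It is not a statement of how identity is to be determined for the purposes of the claim: `percent_identity` assumes two sequences that are already aligned, of equal length and without gaps, and all function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand of a DNA sequence, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned, gap-free sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_fragment_length(seq, minimum=15):
    """Check the 15 bp minimum length stated for fragments in item (e)."""
    return len(seq) >= minimum


# Hypothetical sequences, chosen only to exercise the functions.
print(reverse_complement("ATGC"))
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(meets_fragment_length("ACGTACGTACGTACG"))
```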

2. The isolated nucleic acid molecule of claim 1, which is a DNA
molecule.

3. The isolated nucleic acid molecule of claim 1, which is an in vitro
RNA transcript.



4. The isolated nucleic acid molecule of claim 2, wherein said
polynucleotide is cDNA.

5. An isolated nucleic acid molecule comprising a nucleic acid
sequence encoding any one of the MLN 64 variants A-G disclosed in Table VI.

6. A method for making a recombinant vector comprising inserting
the isolated nucleic acid molecule of claim 1 into a vector.

7. A recombinant vector produced by the method of claim 6.

8. A method of making a recombinant host cell comprising
introducing the recombinant vector of claim 7 into a host cell.

9. A recombinant host cell produced by the method of claim 8.

10. A recombinant method for producing a polypeptide comprising
culturing the recombinant host cell of claim 9.

11. An isolated polypeptide selected from the group consisting of:

(a) a polypeptide having the amino acid sequence as shown in
Figure 6 (SEQ ID NO:2), Figure 14 (SEQ ID NO:4), Figure 16 (SEQ ID NO:6),
Figure 21 (A-D) (SEQ ID NO:8), or Figure 24(B) (SEQ ID NO:10);

(b) a polypeptide having the amino acid sequence as encoded
by the cDNA deposited as ATCC Deposit No. 97610, 97608, 97609, 97611, or
97607;

(c) a polypeptide having an amino acid sequence at least 90%
identical to the polypeptide of (a) or (b); and

(d) a polypeptide fragment of any one of (a)-(c), wherein said
fragment is at least 15 amino acids in length.

12. An antibody specific for an isolated polypeptide of claim 11.

13. An isolated polypeptide which is any one of the MLN 64 variants
A-G disclosed in Table VI.

14. A method useful during breast cancer prognosis, comprising:

(a) assaying a first MLN 50, 51, 62 or 64 gene expression level
or gene copy number in breast cancer tissue; and

(b) comparing said first gene expression level or gene copy
number with a second MLN 50, 51, 62 or 64 gene expression level or gene copy
number, whereby the comparison of said first gene expression level or gene copy
number to said second gene expression level or gene copy number is a prognostic
marker for breast cancer.

15. The method of claim 14, wherein said second gene expression level
or gene copy number is assayed in non-tumorigenic breast tissue.

16. The method of claim 14, wherein said second gene expression level
or gene copy number is assayed in tumorigenic breast tissue.

17. The method of claim 14, wherein said gene expression level is
assayed by detecting MLN 50, 51, 62 or 64 protein with an antibody.

18. The method of claim 14, wherein said gene expression level is
assayed by detecting MLN 50, 51, 62 or 64 mRNA.

19. The method of claim 14, wherein said gene copy number is assayed
by performing or detecting extrachromosomal double minutes (dmin), integrated
homogeneously staining regions (hsrs), comparative genomic hybridization
(CGH), or fluorescence in situ hybridization.



WO 97/06256                PCT/US96/12500

20. A method for distinguishing between leukemia cells with
myelocytic or erythroid characteristics, comprising:

assaying leukemia cells for D52 or D53 gene expression, whereby the
presence of D52 gene expression or the lack of D53 gene expression indicates that
the leukemia cells have myelocytic characteristics and the presence of D53 gene
expression or the lack of D52 gene expression indicates that the leukemia cells
have erythroid characteristics.
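
The decision rule recited above can be written out as a short sketch. The wording does not say what follows when both genes, or neither, are expressed, so returning "indeterminate" for those two cases is an assumption made here for illustration; the function name is hypothetical.

```python
def classify_leukemia_cells(d52_expressed, d53_expressed):
    """Map D52/D53 expression status to the characteristics named in the text.

    D52 present with D53 absent -> myelocytic
    D53 present with D52 absent -> erythroid
    Both or neither expressed   -> indeterminate (assumed; not stated in the text)
    """
    if d52_expressed and not d53_expressed:
        return "myelocytic"
    if d53_expressed and not d52_expressed:
        return "erythroid"
    return "indeterminate"


print(classify_leukemia_cells(True, False))
print(classify_leukemia_cells(False, True))
```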
CCGaAGCG(XXCTC(m«XCGOG(»:TGT(»^^
aXXXTaXOGCCGCTCMOGOMGCCTax:!!^^ 

^ MPGFDYKFLEKP 

/wG0(y«xxx:TaT(p6wclGTC^^ 

K__R R I L I C P L C G K P M - " • - - 



K._R__R I L I c PLCGKPMREP_VQVS 
ACCTGCGG(XmGTTTCTGCGATWX:TGCCTGCAGGAGTTCCT(XTGA«G« 
T C G H R FCDTCLQEFLSEGVF 
AAGTGCCCTGAGGAqWGCTTaTCTGGACTATOXyUtf^ 
" " OjQ L P L D YAKIYPDPEL 
rTGGGCCTGCCTATCCGC|yGCATCCAC«TG/iGG/«XXa:TGCCGeTG^ 



JL_L_L_LJ_JU L P L 

GAAGTACAAGTATTGGGCCTGCCTATI 
P u n u I r I D I 



LAflbUiliUiU;TCA«;TGCGAGnTTGTGGCT6TGACTT(^ 
K__R__R_J...K CEFCGCOFSGEA 
.^.«™a:ATGAGGGTATGTGCaXX:AaAGAGTGTCTACTGTBAGAATA«7GTGGT 
YESHEGMCPQESVYCENKCG 
GCamTGATGCGGGKX:TGCTGGCC(yiGCATGCCACCTCTGAGTG(^ 
ARMMRGLLAQHATSECPKRT 
CAGCCCTGCACCTACTGCACTA«GAGTTaTCTTTGACAayVT(X:A5^^ 
QPCTYCTKEFVFOTIOSHQY 
CAGTGCCCAAGGCmTGTtGCCTGCCCCAACCAATGTGGTGTGGGCA^ 
OCPRLPVACPNOCGVGTVAR 
GAGGACCTGttAGGCCATCTGAAGGACAGCTGTAACAOmCTGGTGCTCTa^ 
EDLPGHLKDSCNTALVLCPF 
AAAGACTCCGGCTGCA/mCAreTGaCTAAGCTGGCAATGG(^^ 
KDSGCKHRCP K L A M A R H V F F 
AGTGTGAAGa»CATCTGGCCATGATGTG7|SCCCTG6TC/l6C0^^ 
SVKPH lAMMCk L V S R Q R Q E L
AACTGGAAGAATTTCCAGAAGCCAGGCACGT(X;CGGGGCTCCCTG^^^
NWKNFOKPGTWRGSLDESSL 
GGCTTTGGTTATCCCAAGTTCATCTCCCACCAGGACATTCGAAAGCGAAACTATGTGCGG 
GFGYPKFISHQDIRKRNYVR 



1380 
432 
1440 
452 



GATGATGCAGTCTTCATCaTGCmGTTGAACTGaX^ 1500 



D D A V F I R A A V E L P R K I L S l > 470 

CAGGTGGGGTTCGAGGGGAAAGGACGATGGGGCATGACCTCAGTCAGGCACTGGCTGAAC 1560 

TTG6AGAGGGGGCCGGACCCCCGTCAGCTGCTTCTGCTGCCTAGGTTCTGTTACCCCATC 1620 

CTCCCTCCCCCAGCCACCACCCTCAGGTGCCTCCAATTGGTGCTTCAGCCCTGGCCCCTG 1680 

TGGGGAACAGGTCTTGGGGTCATGAAGGGCTGGAAACAA6TGACCCCAGGGCCTGTCTCC 1 740 

CTTCTTGGGTAGGGCAGACATGCCTTGGTGCCXXJTCACACTCTACACGGACTGAGGTGCC 1800 

TGCTCAGGTGCTATGTCCCAAGAGCCATAAGGGGGTGGGAATTGGGGAGGGAGAAAGGGT 1860 

AGTTCAAAGAGTCTGTCTTGAGATCTGATTTTTTCCCCCTTTACCTAGCTGTGCCCCCTC 1920 

TGCTTATTTATTTCCTTAGTGCCAGGAGGGCACAGCAGGGGAGCCCTGATTTT TAATAAA 1980 

TCCGGAATTGTATTTAnAAAAAA 2004 



FIG. 6B
CAG(»;CGGAAGTGGCGCT(^AAGATCTTCTTCCa:TCTG^ 60

CGGAGaX»ACTGC(X;TTGG(X;CGGGAAGAGaGG(X;CCGT(^^ 
TGCTGCT6AGGCa;CGCCCTC(X:CGCCCTGy^TGgG^ 

M S K L 4 

CgmGAGCTGACgOG^Agja 240 

PRELTRDLERSLPAVASLGS 24 
O 

TCACTGTCgC/OGCCAGAGa:TCTCCTCG(^ 30o 



SLSHSQSLSSHLLPPPEKRR 
GCCATCTCTGATGT(X»(XXX:ACCTTCTGTCTCTTCGTCAa:TTaACCTGCTCTT^^^ 



120 
180 



44 

360 
64 



A I S D V R R T F C L F V T F D L L F I 

f 

TCCCTGCTCTGGATCATCGAACTGAATACCAACACAGGCATCCGTAAGAACTTGGAGCAG 420 

SLLWIIELNTNTGIRKNLEO 84 

GAGATCATCCAGTACAACTTTAAAACTTCCTTCTTCGACATCTTTGTaTGGCCTT^^ 480 

EIIOYNFKTSFFDIFVLAFF 104 

CGCTTCTCTGGACTGCT(X:TAGGCTATGCaTGCTGCAGCT(XXmCT^TGGG^ 540 

R F S G L L L G Y A V L Q L R H W W V I 124 

GC^TCACXJACGCTGGTGTa^AGTGCATTaTCATTGTCAAGGTCATCCTCTCTGA^ 600 

AVTTLVSSAFLIVKVILSEL 144 

CTCAGCAAAGGGGCATTTGGCTACCTGCTCCCCATCGTCTCTTTTGTCCTCGCCTGGTTG 660 

LSKGAFGYLLPIVSFVLAWL 164 

GAGACCTGGTTCCTTGACTTCAAAGTCCTACCCCAGGAAGCTGAAGAGGAGCGATGGTAT 720
ETWFLDFKVLPQEAEEERWY 184

CTTGCaCTCAaTTpCTGnG 780 

LAAQVAVARGPLLFSGALSE 204 

GG/CAGTTCTAJTCACC^^ 840 

GQFYSPPESFAGSDNESDEE 224 

GTTGCTGGGAAGAAAAGTTTCTCTGCTCAGGAGCGGGAGTACATCCGCCAGGGGA^ 900 

VAGKKSFSAQEREYIRQGKE 244 

6CCA(m:AGTGGTGGACCABATCTTGGC(Xy«JAAGAGAACTGGAAGTTTGAGAAGW^^ 960 

ATAVVDQI LAQEENWKFEKN 264 

T 

AATGAATATGGGGACACCGTGTACACCATTGAAGTTCCCTTTCACGGCAAGACGTTTATC 1020 

NEYGDTVYTIEVPFHGKTFI 284 

CTGAAGACCTTCCTGCCCTGTCCTGCGGAGCTCGTGTACCAGGAGGTGATCCTGCAGCCC 1080 

LKTFLPCPAELVYQEVILQP 304 

GAGAGGATGGTGCTGTGGAACAAGACAGTGACTGCCTGCoJATCCTGCAGre^ 1140 

ERMVLWNKTVTACQILQRVE 324 

GACAACACCCTCATCTCCTATGACGTGTCTGCAGGGGCTGCGGGCGGCGTGGTCTCa^ 1200 

DNTLISYDVSAGAAGGVVSP 344 

AGGGACTTCGTGAATGTCCGGCGCATTGAGCGGCGCAGGGACCGATACTTGTCATCAGGG 1260 

RDFVNVRRIERRRDRYLSSG 364 

ATCGCCACCTCACACAGTGCCAAGCCaCGACGCACAAATATGTCcLGGAGAGAATGGC 1320 

IATSHSAKPPTHKYVRGENG 384 

FIG.16B 
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CCT(XX»;CTTCATa;TGCTCAAGTCGa:CAGTAAC(XXX»TGTTT(X^ 1380 

PGGFIVLKSASNPRVCTFVW 404 

ATTCTTAATACAGATCTCAA(^CCGCCTGCC(XGGTACCTCATCCACCAG^ 1440 

ILNTDLKGRLPRYLIHQSLA 424 

GCCACCATGTTTGAATTTGCCnTCAOITGCGACAGCGCATCAGrcAGCTGGGGGCC^ 1500 

ATMFEFAFHLRQRISELGAR 444 

GCGTGACTGTGCC(XX:TCCCACCX:TGCGGGCCAGGGTCCTGTCGCCACW^ 1560 

A * 445 



CAGAAAGGGTGCCAGTTGGGCTCGCACTGCCCACATGGGACCTGGCCCCAGGCTGTCACC 1620 

CTCCACCGAGCX:aCGCAGTGCCTGGAGTTGACTGACTGAGCAGGCTGTGGGGTGGAGCAC 1680 

TGGACTCCGGGGCCCCACTGGCTXAGGAAGTGGGGTCTGGCCTGTTGATGTTTACATGG 1740 

CGCCCTGCCTCCTGGAGGACCAGATTGCTCTGCCCCACCTTGCCAGGGCAGGGTCTGGGC 1800 

TGGGCACCTGACTTGGCTGGGGAGGACCAGGGCCCTGGGCAGGGCAGGGCAGCCTGTCAC 1860 

CCGTGTGAAGATGAAGGGGCTCTTCATCTGCCTGCGCTCTCGTCGGTTTTTTTAGGATTA 1920 

TTGAAAGAGTCTGGGACCCTTGTTGGGGAGTGGGTGGCAGGTGGGGGTGGGCTGCTGGCC 1980 

ATGAATCTCTGCCTCTCCCAGGCTGTCCCCCTCCTCCCAGGGCCTCCTGGGGGACCTTTG 2040 

TATTAAGCC AATTAAAA ACATGAATTTAAAAAA 2073 



FIG.16C 
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6AATTCCGTT 6CT6TC6CAC ACACACACAC ACACACACAC ACACCCCAAC ACACACACAC 60 
ACACCCCAAC ACACACACAC ACACACACAC ACACACACAC ACACACACAC ACACAGCGGG 120 
ATGGCCGAGC GCCGCACGCG TAGCACGCCG GGACTAGCTA TCCAGCCTCC CAGCAGCCTC 180 

TGCGACGGGC GCGGTGCGTA NGTACCTCGC CGGTGGTGGC CGTTCTCCGT AAG ATG 236 
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AAT CCA GCA TAG ATA CCT CGG AAA ^ CTC TTC 111 GAG CAT GAT CTT 812 
Asn Pro Ala Tyr Ile Pro Arg Lys Gly Leu Phe Phe Glu His Asp Leu 

180 185 190 

CGA GGG CAA ACT CAG GAG GAG GAA GTC AGA CCC AAG GGG CGT CAG CGA 860 
Arg Gly Gln Thr Gln Glu Glu Glu Val Arg Pro Lys Gly Arg Gln Arg 

195 200 205 

AAG CTA TGG AAG GAT GAG GGT CGC TGG GAG CAT GAC AAG TTC CGG GAA 908 
Lys Leu Trp Lys Asp Glu Gly Arg Trp Glu His Asp Lys Phe Arg Glu 
210 215 220 225 

GAT GAG CAG GCC CCA AAG TCC CGA CAG GAG CTC ATT GCT CTT TAT GGT 956 
Asp Glu Gln Ala Pro Lys Ser Arg Gln Glu Leu Ile Ala Leu Tyr Gly 

230 235 240 

TAT GAC ATT CGC TCA GCT CAT AAT CCT GAT GAC ATC AAA CCT CGA AGA 1004 
Tyr Asp Ile Arg Ser Ala His Asn Pro Asp Asp Ile Lys Pro Arg Arg 

245 250 255 

ATC CGG AAA CCC CGA TAT GGG AGT CCT CCA CAA AGA GAT CCA AAC TGG 1052 
Ile Arg Lys Pro Arg Tyr Gly Ser Pro Pro Gln Arg Asp Pro Asn Trp 

260 265 270 

AAC GGT GAG CGG CTA AAC AAG TCT CAT CGC CAC CAG GGT CTT GGG GGC 1100 
Asn Gly Glu Arg Leu Asn Lys Ser His Arg His Gln Gly Leu Gly Gly 

275 280 285 

ACC CTA CCA CCA AGG ACA TTT ATT AAC AGG AAT GCT GCA GGT ACC GGC 1148 
Thr Leu Pro Pro Arg Thr Phe Ile Asn Arg Asn Ala Ala Gly Thr Gly 
290 295 300 305 

CGT ATG TCT GCA CCC AGG AAT TAT TCT CGA TCT GGG GGC TTC AAG GAA 1196 
Arg Met Ser Ala Pro Arg Asn Tyr Ser Arg Ser Gly Gly Phe Lys Glu 

310 315 320 

GGT CGT GCT GGT TTT AGG CCT GTG GAA GCT GGT GGG CAG CAT GGT GGC 1244 
Gly Arg Ala Gly Phe Arg Pro Val Glu Ala Gly Gly Gln His Gly Gly 

325 330 335 

CGG TCT GGT GAG ACT GTT AAG CAT GAG ATT AGT TAC CGG TCA CGG CGC 1292 
Arg Ser Gly Glu Thr Val Lys His Glu Ile Ser Tyr Arg Ser Arg Arg 

340 345 350 

CTA GAG CAG ACT TCT GTG AGG GAT CCA TCT CCA GAA GCA GAT GCT CCA 1340 
Leu Glu Gln Thr Ser Val Arg Asp Pro Ser Pro Glu Ala Asp Ala Pro 

355 360 365 

GTG CTT GGC AGT CCT GAG AAG GAA GAG GCA GCC TCA GAG CCA CCA GCT 1388 
Val Leu Gly Ser Pro Glu Lys Glu Glu Ala Ala Ser Glu Pro Pro Ala 
370 375 380 385 

GCT GCT CCT GAT GCT GCA CCA CCA CCC CCT GAT AGG CCC ATT GAG AAG 1436 
Ala Ala Pro Asp Ala Ala Pro Pro Pro Pro Asp Arg Pro Ile Glu Lys 
390 395 400 

FIG.21B 



SUBSTITUTE SHEET (RULE 26) 



WO 97/06256 



PCT/US96/12500 



30/45 



AAA TCC TAT TCC CGG GCA AGA AGA ACT CGA ACC AAA GTT GGA GAT GCA 1484 
Lys Ser Tyr Ser Arg Ala Arg Arg Thr Arg Thr Lys Val Gly Asp Ala 

405 410 415 

GTC AAG CTT GCA GAG GAG GTG CCC CCT CCT CCT GAA GGA CTG ATT CCA 1532 
Val Lys Leu Ala Glu Glu Val Pro Pro Pro Pro Glu Gly Leu Ile Pro 

420 425 430 

GCA CCT CCA GTC CCA GAA ACC ACC CCA ACT CCA CCT ACT AAG ACT GGG 1580 
Ala Pro Pro Val Pro Glu Thr Thr Pro Thr Pro Pro Thr Lys Thr Gly 

435 440 445 

ACC TGG GAA GCT CCG GTG GAT TCT AGT ACA AGT GGA CTT GAG CAA GAT 1628 
Thr Trp Glu Ala Pro Val Asp Ser Ser Thr Ser Gly Leu Glu Gln Asp 

450 455 460 465 

GTG GCA CAA CTA AAT ATA GCA GAA CAG AAT TGG AGT CCG GGG CAG CCT 1676 
Val Ala Gln Leu Asn Ile Ala Glu Gln Asn Trp Ser Pro Gly Gln Pro 

470 475 










TCT TTC CTG CAA CCA CGG GAA CTT CGA GGT ATG CCC AAC CAT ATA CAC 
Ser Phe Leu Gln Pro Arg Glu Leu Arg Gly Met Pro Asn His Ile His 






























ATG GGA [illegible] [illegible] CCT CCA CCT CAG TTT AAC CGG ATG GAA GAA ATG CTC 1772 
Met Gly Ala Gly Pro Pro Pro Gln Phe Asn Arg Met Glu Glu Met Leu 

500 505 510 




ACT TTG CAA ATA TCC ATT AAA TAC CTG CCA TGT ACC AAG TGT TTT TCA 1820 
Thr Leu Gln Ile Ser Ile Lys Tyr Leu Pro Cys Thr Lys Cys Phe Ser 

515 520 525 

ACA CCT AAA GGA AGG TAG GACTTCATAT GAGAGCCCTC TAGAATTCTT 1868 
Thr Pro Lys Gly Arg * 

530 535 



ATTGTTTAGG CCTCTTTCTT TGTCTCAGGG TGTCCAGGGT GTCCAGGGTG GTCGAGCCAA 1928 
ACGCTATTCA TCCCAGCGGC AAAGACCTGT GCCAGAGCCC cccGcccac CAGTGCATAT 1988 
CAGTATCATG GAGGGACATT ACTATGATCC ACTGCAGTTC CAGGGACCAA TCTATACCCA 2048 
TGGTGACAGC CCTGCCCCGC TGCCTCCACA GGGCATGCTT GTGCAGCCAG GAATGAACCT 2108 
TCCCCACCCA GGTrTACATC CCCATCAGAC ACCAGCTCCT CTGCCCAATC CAGGCCTCTA 2168 
TCCCCCACCA GTGTCCATGT aCCAGGACA GCCACCACCT CAGCAGTTGC TTGCTCCTAC 2228 
TTACTTTTCT GCTCCAGGCG TCATGAACTT TGGTAATCCC AGTTACCCTT ATGCTCCAGG 2288 



FIG.21C 


GGCACTGCCT CCCCCACCAC CGCCTCATCT GTATCCTAAT ACACAGGCCC CATCACAGGT 2348 
ATATGGAGGA GTGACCTACT ATAACCCCGC CCAGCAGCAG GTGCAGCCAA AGCCCTCCCC 2408 
ACCCCGGAGG ACTCCCCAGC CAGTCACCAT CAAGCCCCCT CCACCTGAGG TTGTAAGCAG 2468 
GGGTTCCAGT TAATACAAGT TTCTGAATAT TTTAAATCTT AACATCATAT AAAAAGCAGC 2528 
AGAGGf GAGA ACTCAGAAGA GAAATACAGC TGGCTATCTA CTACCAGAAG GGCTTCAAAG 2588 
ATATAGGGTG TGGCTCCTAC CAGCAAACAG CTGAAAGAGG AGGACCCCTG CCTTCCTCTG 2648 
AGGACAGGCT CTAGAGAGAG GGAGAAACAA GTGGACCTCG TCCCATCTTC ACTCTTCACT 2708 
TGAGTTGGCT GTGTTCGGGG GAGCAGAGAG AGCCAGACAG CCCCAAGCTT CTGAGTCTAG 2768 
ATACAGAAGC CCATGTCTTC TGCTGTTCTT CACTTCTGGG AAATTGAAGT GTCTTCTGTT 2828 
CCCAAGGAAG aCCTTCQG TTTGTTTTGT TTTCTAAGAT GTTCATTTTT AAAGCCTQGC 2888 
TTCTTATCCT TAATATTATT TTAATmrT CTCTTTGTTT CTGTTTCTTG CTCTCTQCC 2948 
CTGCCTTTAA ATGAAACAAG TCTAGTCTTC TGGTTTTCTA GCCCCTaCG ATTCCCTTTT 3008 
GACTCTTCCG TGCATCCCAG ATAATGGAGA ATGTATCAGC CAGCCTTCCC CACCAAGTCT 3068 
AAAAAGACCT GGCCTTTCAC TTTTAGTTGG CATTTGTTAT CCTCTTGTAT ACTTCTATTC 3128 
CCTTAACTCT AACCCTGTGG AAGCATGGCT GTCTGCACAG AGGGTCCCAT TGTGCAGAAA 3188 
AGCTCAGAGT AGGTGGGTAG GAGCCCncr CmGACTTA GGTimAGG AGTCTGAG^ 3248 
TCCATCAATA CCTGTACTAT GATGGGCTTC TGTTCTCTGC TGAGGGCCAA TACCCTACTG 3308 
TGGGGAGAGA TGGCACACCA GATGCTmG TGAGAAAGGG ATGGTGGAGT GAGAGCCTTT 3368 
GCCTTTAGGG GTGTGTATTC ACATAGTCCT CAGGGCTCAG TCTTTTGAGG TAAGTGGAAT 3428 
TAGAGGGCCT TGCTTCTCTT CTTTCCATTC TTCTTGCTAC ACCCCTTTTC CAGTTGCTGT 3488 
GGACCAATGC ATCTCTTTAA AGGCAAATAT TATCCAGCAA GCAGTCTACC CTGTCCTTTG 3548 
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CMTTGCTCT TCTCCACGTC TTTCaGCTA CAAGT6TTTT AGATGHACT ACCnATTTT 3608 

CCCCGAATTC TATTTrTGTC OTGCAGACA GAATATAAAA ACTCCTGGGC TTAAGGCCTA 3668 

AGGAAGCCAG TCACCTTCTG GGCAAGGGCT CCTATCTTTC CTCCCTATCC ATGGCACTAA 3728 

ACCACTTCTC TGCTGCCTCT GTGGAAGAGA TTCCTATTAC TGCAGTACAT ACGTCTGCCA 3788 

GGGGTAACCT GGCCACTGTC CCTGTCCTTC TACAGAACCT GAGGGCAAAG ATGGTGGCTG 3848 

TGTCTCTCCC CGGTAATGTC ACTGTTTTTA TTCCTTCCAT CTAGCAGCTG GCaAATCAC 3908 

TCTGAGTCAC AGGTGTGGGA TGGAGAGTGG GGAGAGGCAC TTAATCTGTA ACCCCCAAGG 3968 

AGGAAATAAC TAAGAGATTC TTCTAGGGGT AGCTGGTGGT TGTGCCTTTT GTAGGCTGTT 4028 

CCCITTGCCT TAAACCTGAA GATGTCTCCT CAAGCCTGTG GGCAGCATGC CCAGATTCCC 4088 

AGACCTTAAG ACACTGTGAG AGTTGTCTCT GTTGGTCCAC TGTGTTTAGT TGCAAGGATT 4148 

TTTCCATGTG TGGTGGTGTT TTTTGTTACT GTTTTAAAGG GTGCCCATTT GTGATCAGCA 4208 

TTGTGACTTG GAGATAATAA AATTTAGACT ATAAACTTGA AAAAA 4253 

FIG.21E 






1 CAGAAGCGGCTAGTGGCGGCTGCCTGCGTCCCCAACCCCCTCCGCGCAGCGCTCGC&ACA 60 

61 CGCGTGCCAGGAGTGGGAGCGAGCGGCGGGGCCAGCTGCGnCTGAGCCTGGGCGCAGC 120 

121 GCCATCTGCTCTGGGAAGCACCAGGGTGTCCCCGCCGCCCTCAGCTCGAAGTCAGCCACC 180 

181 ATGGAGGCGCAGGCACAAGGTTTGTTGGAGACTGAACCGTTGCAAGGAACAGACGAAGAT 240 

1 MEAQAQGLLETEPLQGTDED 20 

241 GCAGTAGCCAGTGCTGACTTCTCTAGCATGCTCTCTGAGGAGGAAAAGGAAGAGTTAAAA 300 

21 AVASADFSSMLSEEEKEELK 40 

301 GCAGAGTTAGTTCAGCTAGAAGACGAAATTACAACACTACGACAAGTTTTGTCAGCGAAA 360 

41 AELVQLEDEITTLRQVLSAK 60 

361 GAAAGGCATCTAGTTGAGATAAAACAAAAACTCGGCATGAACCTGATGAATGAATTAAAA 420 

61 ERHLVEIKQKLGMNLMNELK 80 

421 CAGAACTTCAGCAAAAGCTGGCATGACATGCAGACTACCACTGCCTACAAGAAAACACAT 480 

81 QNFSKSWHDMQTTTAYKKTH 100 

481 GAAACCCTGAGTCACGCAGGGCAAAAGGCAACTGCAGCTTTCAGCAACGTTGGAACGGCC 540 

101 ETLSHAGQKATAAFSNVGTA 120 

541 ATCAG(W\GAAGTTCGGAGACATGAGTTACTCCATTCGCCATTCCATAAGTATGCCTGCT 600 

121 ISKKFGDMSYSIRHSISMPA 140 

601 ATGAGGAATTCTCCTACTTTCAAATCAmGAGGAGAGGGTTGAGACAACTGTCACAAGC 660 

141 MRNSPTFKSFEERVETTVTS 160 

661 CTCAAGACGAAAGTAGGCGGTACGAACCaAATGGAGGCAGTTTTGAGGAGGTCCTCAGC 720 

161 LKTKVGGTNPNGGSFEEVLS 180 

721 TCCACGGCCCATGCCAGTGCCCAGAGCTTGGCAGGAGGQCCCGGCGGACCAAGGAGGAG 780 

181 STAHASAQSLAGGSRRTKEE 200 

781 GAGCTGCAGTGCTAAGTCCAGCCAGCGTGCAGCTGCATCCAGAAACCGGCCACTAa^ 840 

201 V E L Q C * 204 

841 CCCATCTCTGCCTGTGCnATCCAGATAAGAAGACCAAAATCCCGCTGGGAAAAACCCAG 900 

901 GCCnGACAnGnAnCAAATGGCCCCTCCAGAAAGmAATGATnCCATTTGTATrr 960 

961 GTCnGATGATG6ACCA(TrGACCATCACAmCAGTATTCATAGATGACTGTCACATrr 1020 

1021 TAAAA TGnCC CACTTGAGCAGGTACACAACTGGTCATAATTCCTGTCTGTGTAATTCGA 1080 

1081 TGTATATTmCCAAACATGTAGCTAnGmGCTnGATTmGCTTGGCCTCCTnAT 1140 

1141 GATGTGCATGTCCTTGAAGGCTGAATGAACAGTCCCTrrCAGTTCAGCAGATCAACAGGA 1200 

1201 TGGAGCTCnCATGACTGTCTCCAGCAATAGGATGATnACTATAAATnCATCCAACTA 1260 

1261 CnGTGATCTCTCTCACCTACATCAAmTGTATGTTAAmCAGC AATTAAAA RMTTR 1320 

1321 ATTTTAAAAAAAAAAAAAAAAAAAAAA 1347 
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1 CQGGAGCGAGGreGCTCAGACATGGACCGCGGCGAGCMGGTaGCTGAAGACAG^ 60 

1 MDRGEQGLLKTEP13 

61 GTGGCCGAGGAAGGAGAGGATGCTGnACCATGCTCAGTGCTCCAGAGGCGCTGACGGAA 120 

14 VAEEGEDAVTMLSAPEALTE 33 

121 GAGGAGCAAGAGGAGCTGAGGCGGGAGCTTACTAAGGTGGAAGAAGAAATCCAGACTCTG 180 

34 EEQEELRRELTKVEEEIQTL 53 

181 TCCCAAGTATTGGCCGCAAAAGAGAAGCATCTCGCCGAGCTCAAGCGGAAGGTCGGC^ 240 

54 SQVLAAKEKHLAELKRKLGI 73 

241 TCCTCGCTTCAGGAGTTCAAGCAGAACATTGCCAAAGG6TGGCAAGACGTGACGGCAACC 300 

74 SSLQEFKQNIAKGWQDVTAT 93 

301 AATGCATACAAGAAGACCTCTGAAACTCTATC6CAAGCTGGGCAGAAGGCCTCCGCTGCA 360 

94 NAYKKTSETLSQAGQKASAA113 

361 TTTTCATCGGTTGGCTCAGTCATCACCAAAAAGCTGGAAGACGTGAAAAACTCCCCAACT 420 

114 FSSVGSV ITKKLEDVKNSPT133 

421 TTCAAGTCATTTCAAGAAAAAGTTCAAAAmAAAGTCrAAAGTAGGAG^ 480 

134 FKSFEEKVENLKSKVGGAKP153 

481 GCTGGCGGCGATTTTGGAGAAGTCCTGAATTCCACAGCCAACGCTACCAGTACCATGACC 540 

154 AGGDFGEVLNSTANATSTMT173 

541 ACAGAGCCTCCT(X:ABAACAGATCACABAGAGCCCCT6AGCT1B^ 600 

174 T E P P P E Q M T E S P * 185 

601 GCCCACTGCCAGGTGCTGCCGGCGAGAGCCAAGTACATCTTGACAACGCTCATGGCTGCG 660 

661 GATTrCCACCAGATGTCCTmAmAGCmACTTAmcrrTGACCA^^ 720 

721 AATGAAACAAAGTGAAATCACnGACCTCCACTCCAGGGAAACACTGTTAGCATGCATGG 780 

781 AAGGCCCTTTGTATAGGAAACAGCATCATAGAGCCTCTGGTAGATCCCTGCAGGCAAaA 840 

841 CTGTGTnCTCC nAAAATC ACTGTACATCTGGAnCTAGTnGATCTTTCTTTACTATC 900 

901 TACATGAATCATTGTTmGGGTCT^CTCTACACnAAT(>\AmC^^ 960 

961 TrCTAAA miGGn^ MAAGTCTTCGAAmTTTCATTCCTTTCAAAGSAGAAACTA 1020 

1021 CCAGCTACATTTTTmCTCGGATAAACAGlTCTGTGAGGAC(7^TATC^ 1080 

1081 AGACACCAGACTAAAGTAGACAGGTCTGTATGCAGTTCTATAGTTCTG^ 1140 

1141 ATGCAGACACTCAAACnCCAGTGGGGAGAGTGTGGGTCCTGCTCnGCCTTGGTAACTG 1200 

1201 TCAmG TAGCTACAT CTATnGAGCTCAAATATGCmTCAGmTrTATTATACCATr 1260 

1261 CTCACACATTTTmACAAGAnAAAATTTAAmCAGGTAAAnGAGAGAATAACAm 1320 

1321 TGAGTTAAGTATATGATATTACAGTAAGTTGGAATGmCCACAnCATCACTGATAATT 1380 

1381 CCAAAAGTCTAAACGTCnTAGGTCTATACAGTTATAAAAATGCTAAAAAAAATTCACCA 1440 

1441 TAGGGGAAATTACTGCaCCATTAAATCCAmAACACCm 1500 

1501 TATCAGAAATACAACTTGAATATrrrTTATACTAAGBGATTm^^ 1560 

1561 GCGAGGCGTTACTATGACTGAGaGATCAGGCAGTTTCTGTTCTCAGTGTGnAGTGCCT 1620 

1621 GAGCTGTTCTGTATGTAGAAATCGTTCCCACTCTAAGAACTGTCGGGGCTGTGAGTCAAA 1680 

1681 GCTTCCCAGTGGCTCTGCTAAGCCCCTCTGnAACTGTGGTCACTCCTGACTCACTCCTG 1740 

1741 CnCCrrTGCTGTGTATGTrTATGGCCTATGAGGTTGTATCTGTTACnCTnCTCTAn 1800 

1801 GTGGTTTTACCAGTGTCCATGCCAAATGnAACTGCCAAGCTTGGAGTGACCTAAAGCCT 1860 

1861 TmCAGAGCAIGGCTAGATnAAnGAGGATAAGGTrrCTGCAAACCAGAATTGAAAAG 1920 

1921 CCACAGTGTCGGTTGTCACAAAATGACATGCTGCCAnCCTGGTTGCTGCTCG^^ 1980 

1981 TGGAAACTATGCTTGAmCATGTGAAAATCTTMIMaGTCTCTGTCTCAGIAAAAAAA 2040 

2041 AAAAAAAAAAA 2051 
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